# Reltio Forge
*The velocity layer for AI-accelerated software delivery.*

Transparent Factory defines **what** safe AI-assisted development looks like.
Reltio Forge defines **how** you enforce it at speed: daily releases, AI-written code, enterprise guardrails.

---
## The Problem
AI coding tools (Codex, Claude Code, Cursor, Copilot) can generate thousands of lines per day. Teams adopting them face a paradox: **velocity without guardrails is liability, but guardrails without automation kill velocity.**
The fastest-moving open-source projects, such as OpenClaw (247K stars, 390K CI runs, daily releases, mostly AI-written), prove this is solvable. They ship daily because their safety net is automated, not manual.

Reltio Forge codifies these patterns for enterprise teams already following Transparent Factory tenets.

---
## Relationship to Transparent Factory
| Layer | Scope | Analogy |
|-------|-------|---------|
| **Transparent Factory** | Architectural tenets (what to build) | Building code |
| **Reltio Forge** | Delivery pipeline (how to ship safely) | Inspection process |
Transparent Factory's 5 tenets remain the standard. Forge wraps each one in automated enforcement:
| Tenet | Forge Enforcement |
|-------|-------------------|
| Atomic Flagging | CI gate: no deployment without flag wrapper. Flag TTL enforced by cron reaper. |
| Elastic Schema | Migration linter in CI. Additive-only check. Dual-write test harness. |
| Cognitive Durability | ADR-or-reject: PRs touching architecture require an ADR file or are blocked. |
| Semantic Observability | Span coverage gate: new endpoints must emit reasoning spans or CI fails. |
| Configurable Autonomy | Governance-as-code: autonomy levels declared in config, validated at deploy. |
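As one illustration of the enforcement column, the flag TTL reaper for Atomic Flagging can be pictured as a pure function that a cron job calls. This is a minimal sketch under assumptions: the `FeatureFlag` shape and the `findExpiredFlags` name are hypothetical, and the 14-day default simply mirrors the TTL quoted in the reference pipeline.

```typescript
// Hypothetical flag record; field names are illustrative, not a Forge schema.
interface FeatureFlag {
  name: string;
  createdAt: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the flags that are strictly older than `ttlDays` at time `now`.
// A cron job could call this and alert on (or open cleanup tickets for) the result.
function findExpiredFlags(
  flags: FeatureFlag[],
  now: Date,
  ttlDays: number = 14,
): FeatureFlag[] {
  return flags.filter(
    (flag) => now.getTime() - flag.createdAt.getTime() > ttlDays * MS_PER_DAY,
  );
}
```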
---
## The 6 Forge Principles
### 1. Gate Before Merge, Not After Deploy
Every PR passes through an automated gauntlet before a human sees it:
```
Cheap & Fast                                Expensive & Slow
─────────────────────────────────────────────────────────►
Types → Lint → Secrets → Build → Unit → E2E → AI Review
~30s    ~30s   ~10s      ~60s    ~90s   ~3m   ~2m
```
Fail-fast ordering: if types are broken, don't waste five minutes on E2E and AI review.
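The ordering rule can be sketched as a small runner that stops at the first failing gate. Everything here is an assumption for illustration: `GateStep`, `GauntletResult`, and `runGauntlet` are hypothetical names, and a real gate would shell out to a tool rather than call a function.

```typescript
// A gate is a named check; `run` returns true on success.
interface GateStep {
  name: string;
  run: () => boolean;
}

interface GauntletResult {
  passed: string[];
  failedAt: string | null;
  skipped: string[];
}

// Runs gates in the given (cheapest-first) order. After the first failure,
// the remaining gates are recorded as skipped and never executed.
function runGauntlet(gates: GateStep[]): GauntletResult {
  const passed: string[] = [];
  const skipped: string[] = [];
  let failedAt: string | null = null;
  for (const gate of gates) {
    if (failedAt !== null) {
      skipped.push(gate.name);
    } else if (gate.run()) {
      passed.push(gate.name);
    } else {
      failedAt = gate.name;
    }
  }
  return { passed, failedAt, skipped };
}
```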
*Inspired by: OpenClaw CI scoping — detects what changed, skips irrelevant jobs, runs cheap checks first.*
### 2. AI Reviews AI
When code is AI-generated, human review alone is insufficient at scale. Add an AI reviewer in CI that:
- Checks for Transparent Factory tenet compliance
- Flags security anti-patterns (SQL injection, hardcoded secrets, missing auth)
- Validates schema changes are additive
- Produces a structured review (not just "LGTM")
The human reviewer then reviews the AI review + the code. Two-layer defense.
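A structured review, as opposed to a bare "LGTM", might look like the sketch below. The `ReviewFinding` and `AiReview` shapes are assumptions made for illustration and are not the output format of any particular review tool; the check names in the comment are borrowed from the reference pipeline.

```typescript
type FindingSeverity = "info" | "warning" | "blocker";

// Hypothetical shape for one finding in an AI review.
interface ReviewFinding {
  check: string; // e.g. "security_anti_patterns" or "schema_additive_only"
  severity: FindingSeverity;
  file: string;
  message: string;
}

interface AiReview {
  summary: string;
  findings: ReviewFinding[];
}

// The gate blocks the PR when any finding is a blocker.
function reviewVerdict(review: AiReview): "approve" | "block" {
  return review.findings.some((f) => f.severity === "blocker") ? "block" : "approve";
}

// Groups findings by check name so the human reviewer can scan them per concern.
function findingsByCheck(review: AiReview): Record<string, ReviewFinding[]> {
  const grouped: Record<string, ReviewFinding[]> = {};
  for (const finding of review.findings) {
    const bucket = grouped[finding.check] || [];
    bucket.push(finding);
    grouped[finding.check] = bucket;
  }
  return grouped;
}
```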
*Inspired by: OpenClaw's `claude-code-review.yml` GitHub Action on every PR.*
### 3. Formal Models for High-Risk Paths
Not everything needs formal verification. But the paths where a bug means data breach, money loss, or compliance failure? Those get TLA+ (or Alloy, or property-based tests at minimum).
**Forge rule:** identify your top 5 "if this breaks, we're on the news" paths. Write executable models for them. Run them in CI. Maintain both "green" (invariant holds) and "red" (invariant breaks under known bug class) models.
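The green/red pairing can be shown in miniature without TLA+. The sketch below is a TypeScript stand-in, not a TLA+ model: it exhaustively explores every short sequence of writes (a bounded check in the spirit of property-based testing) against a session-isolation invariant. The store shape, both write functions, and the case-folding bug are hypothetical examples chosen for illustration.

```typescript
type SessionStore = Record<string, string[]>;
type WriteFn = (store: SessionStore, session: string, message: string) => SessionStore;

// Green model: a write lands only in the writer's own session.
const isolatedWrite: WriteFn = (store, session, message) => {
  const next: SessionStore = {};
  for (const key of Object.keys(store)) {
    next[key] = (store[key] || []).slice();
  }
  next[session] = (next[session] || []).concat([message]);
  return next;
};

// Red model: a known bug class, key normalisation that merges sessions "A" and "a".
const leakyWrite: WriteFn = (store, session, message) =>
  isolatedWrite(store, session.toLowerCase(), message);

// Invariant: every message stored under a session was written by that session.
// Messages are tagged "<writer>:<step>" so ownership can be checked.
function holdsIsolation(store: SessionStore): boolean {
  return Object.keys(store).every((owner) =>
    (store[owner] || []).every((message) => message.indexOf(owner + ":") === 0),
  );
}

// Explores every write sequence up to `depth` and returns the first trace that
// breaks the invariant, or null when no trace does.
function findViolation(write: WriteFn, sessions: string[], depth: number): string[] | null {
  const search = (store: SessionStore, trace: string[]): string[] | null => {
    if (!holdsIsolation(store)) return trace;
    if (trace.length === depth) return null;
    for (const session of sessions) {
      const tagged = session + ":" + trace.length;
      const found = search(write(store, session, tagged), trace.concat([session]));
      if (found !== null) return found;
    }
    return null;
  };
  return search({}, []);
}

// CI would assert both directions: green must hold, red must be caught.
const greenTrace = findViolation(isolatedWrite, ["A", "a", "B"], 3);
const redTrace = findViolation(leakyWrite, ["A", "a", "B"], 3);
```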
*Inspired by: OpenClaw's TLA+ models for session isolation, pairing, ingress gating, routing, and tool execution.*
### 4. Three Channels, One Truth
Ship to `beta` first. Soak. Promote to `stable`. Never skip.
```
main ──► beta (automated) ──► stable (promoted after soak)
└── canary (optional: % rollout)
```
- Tags are immutable
- Promotion is a metadata operation, not a rebuild
- Rollback = promote the previous stable tag
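The three rules above can be sketched as pointer moves over immutable tags. This is a hedged illustration only: `ChannelState`, `promoteBetaToStable`, and `rollbackStable` are hypothetical names, the soak period is not modelled, and a real setup would delegate to its registry's tagging mechanism.

```typescript
// Channel name -> immutable version tag, plus a history of earlier stable tags.
interface ChannelState {
  tags: Record<string, string>; // e.g. { beta: "v1.4.0", stable: "v1.3.2" }
  stableHistory: string[]; // previously promoted stable tags, oldest first
}

// Promotion re-points `stable` at the tag `beta` already carries. Nothing is rebuilt.
function promoteBetaToStable(state: ChannelState): ChannelState {
  const beta = state.tags["beta"];
  if (beta === undefined) throw new Error("no beta release to promote");
  const previous = state.tags["stable"];
  return {
    tags: { ...state.tags, stable: beta },
    stableHistory:
      previous === undefined ? state.stableHistory : state.stableHistory.concat([previous]),
  };
}

// Rollback re-points `stable` at the most recent previous stable tag.
function rollbackStable(state: ChannelState): ChannelState {
  const history = state.stableHistory;
  const previous = history[history.length - 1];
  if (previous === undefined) throw new Error("no previous stable tag to roll back to");
  return {
    tags: { ...state.tags, stable: previous },
    stableHistory: history.slice(0, -1),
  };
}
```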
*Inspired by: OpenClaw's stable/beta/dev channels with npm dist-tag promotion.*
### 5. Trust Boundaries as Code
Define explicit trust boundaries in your architecture. Each boundary is a policy enforcement point:
```
Untrusted Input → [Boundary 1: Validation] → Business Logic → [Boundary 2: Authorization] → Data → [Boundary 3: Execution Sandbox]
```
**Forge rule:** every service declares its trust boundaries in a `TRUST.md` or equivalent config. CI validates that boundary enforcement code exists at each declared point.
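One way to picture the CI side of this rule is a check that every declared boundary points at enforcement code that actually exists. The declaration format below is an assumption (the rule only asks for a `TRUST.md` or equivalent config), and file existence is a deliberately weak proxy for real enforcement.

```typescript
// Hypothetical parsed form of a trust-boundary declaration.
interface TrustBoundary {
  name: string; // e.g. "validation", "authorization", "execution-sandbox"
  enforcedIn: string; // repository path expected to hold the enforcement code
}

// Returns the declared boundaries whose enforcement file is absent from the repo.
// A CI gate would fail when this list is non-empty.
function missingEnforcement(declared: TrustBoundary[], repoFiles: string[]): TrustBoundary[] {
  return declared.filter((boundary) => repoFiles.indexOf(boundary.enforcedIn) === -1);
}

// Usage sketch with made-up paths.
const declaredBoundaries: TrustBoundary[] = [
  { name: "validation", enforcedIn: "src/boundaries/validate.ts" },
  { name: "authorization", enforcedIn: "src/boundaries/authorize.ts" },
  { name: "execution-sandbox", enforcedIn: "src/boundaries/sandbox.ts" },
];
const undeclaredGaps = missingEnforcement(declaredBoundaries, [
  "src/boundaries/validate.ts",
  "src/boundaries/authorize.ts",
  "src/index.ts",
]);
```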
*Inspired by: OpenClaw's three trust boundaries (channel access → session isolation → tool execution sandbox) documented in their MITRE ATLAS threat model.*
### 6. Configurable Risk Tolerance
Not every team, environment, or customer has the same risk appetite. Forge doesn't mandate a single posture — it mandates that the posture is **explicit and configurable**:
- `strict`: all gates required, no overrides, formal models must pass
- `standard`: all gates required, break-glass override with audit trail
- `experimental`: gates advisory-only, all violations logged but non-blocking
The posture is declared per-environment, per-service, or per-tenant. Never implicit.
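The three postures reduce to a small decision function. The sketch below is an assumption about how they could be encoded; `GateOutcome`, `BreakGlass`, and `resolveOutcome` are hypothetical names, and writing the audit record for an override is left to the caller.

```typescript
type Posture = "strict" | "standard" | "experimental";
type GateOutcome = "pass" | "warn" | "override" | "block";

// Hypothetical break-glass record; a real one would be written to an audit trail.
interface BreakGlass {
  approver: string;
  reason: string;
}

// Maps a gate result to an outcome under the declared posture.
function resolveOutcome(
  posture: Posture,
  gatePassed: boolean,
  breakGlass?: BreakGlass,
): GateOutcome {
  if (gatePassed) return "pass";
  // experimental: violations are logged but never block
  if (posture === "experimental") return "warn";
  // standard: a failing gate may be overridden, and the override must be audited
  if (posture === "standard" && breakGlass !== undefined) return "override";
  // strict (or standard without an override): the gate blocks
  return "block";
}
```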
*Inspired by: OpenClaw's tool profiles (messaging vs coding), exec approvals, dmPolicy, and allowlists, all configurable per agent.*

---
## Forge Pipeline (Reference Implementation)
```yaml
# .forge/pipeline.yml
version: 1
posture: standard # strict | standard | experimental

gates:
  # --- Fast gates (< 2 min) ---
  types:
    tool: tsc --noEmit
    fail: block
  lint:
    tool: eslint + prettier
    fail: block
  secrets:
    tool: gitleaks / trufflehog
    fail: block

  # --- Build gate ---
  build:
    tool: docker build / npm run build
    fail: block
    depends_on: [types, lint, secrets]

  # --- Test gates (2-5 min) ---
  unit:
    tool: vitest --run
    coverage_threshold: 80%
    fail: block
    depends_on: [build]
  e2e:
    tool: vitest --run --config vitest.e2e.config.ts
    fail: block
    depends_on: [build]

  # --- AI gates (2-3 min) ---
  ai_review:
    tool: claude-code-review
    checks:
      - transparent_factory_compliance
      - security_anti_patterns
      - schema_additive_only
      - adr_required_for_architecture
    fail: block # standard: block. experimental: warn.
    depends_on: [build]

  # --- Formal gates (optional, for declared high-risk paths) ---
  formal:
    tool: tlc
    models_dir: .forge/models/
    fail: block
    depends_on: [build]
    when: changed(.forge/models/) OR changed(src/auth/) OR changed(src/data/)

# --- Transparent Factory tenet enforcement ---
tenets:
  atomic_flagging:
    check: grep-based or AST check for flag wrappers on new features
    ttl_reaper: cron job that alerts on flags past 14-day TTL
  elastic_schema:
    check: migration linter (additive-only, no column drops without dual-write)
    sla: 30 days for migration completion
  cognitive_durability:
    check: PRs touching src/arch/ or adding new services require docs/adr/*.md
  semantic_observability:
    check: new API endpoints must have tracing spans (grep for span creation)
  configurable_autonomy:
    check: governance config exists and is valid JSON/YAML schema
```
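The `depends_on` entries imply an execution plan: gates with no unmet dependencies can run together, and the rest wait. The sketch below shows one way a runner might derive that plan. It is an assumption about runner behaviour, not part of the pipeline format, and `GateGraph` and `planStages` are hypothetical names.

```typescript
// Gate name -> names of the gates it depends on (the `depends_on` lists).
type GateGraph = Record<string, string[]>;

// Groups gates into stages. Every gate in a stage depends only on gates from
// earlier stages, so gates within a stage could run in parallel.
// Throws when a dependency is unknown or the graph contains a cycle.
function planStages(graph: GateGraph): string[][] {
  let remaining = Object.keys(graph);
  const done: string[] = [];
  const stages: string[][] = [];
  while (remaining.length > 0) {
    const ready = remaining.filter((gate) =>
      (graph[gate] || []).every((dep) => done.indexOf(dep) !== -1),
    );
    if (ready.length === 0) {
      throw new Error("cycle or unknown dependency among: " + remaining.join(", "));
    }
    stages.push(ready);
    for (const gate of ready) done.push(gate);
    remaining = remaining.filter((gate) => ready.indexOf(gate) === -1);
  }
  return stages;
}

// The dependency lists from the pipeline above.
const forgeStages = planStages({
  types: [],
  lint: [],
  secrets: [],
  build: ["types", "lint", "secrets"],
  unit: ["build"],
  e2e: ["build"],
  ai_review: ["build"],
  formal: ["build"],
});
```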
---
## Adoption Path
### Week 1-2: Foundation
- Add fail-fast CI gates (types → lint → secrets → build → test)
- Declare trust boundaries in each service
- Set posture to `experimental` (non-blocking)
### Week 3-4: Enforcement
- Add AI code review to PR pipeline
- Add Transparent Factory tenet checks
- Promote posture to `standard`
### Month 2: Hardening
- Identify top 5 high-risk paths
- Write formal models (TLA+ or property-based tests)
- Add beta/stable channel promotion
- Enable break-glass audit trail
### Month 3+: Maturity
- Promote posture to `strict` for production services
- Publish Forge compliance dashboard
- Integrate with Reltio's existing release governance
---
## Why This Works
OpenClaw ships daily with AI-written code because:
1. **Automated gates catch 95% of problems** before a human looks at the code
2. **Trust boundaries limit blast radius** when something slips through
3. **Configurable posture** means teams adopt incrementally, not all-or-nothing
4. **Formal models** provide mathematical confidence on the paths that matter most
Reltio Forge takes these patterns and wraps them around the Transparent Factory tenets you already have. The tenets don't change. The enforcement becomes automatic.

*The factory builds the product. The forge tempers the steel.*