The Deliverable Contract System — Implementation Plan

What the red-teams changed

Two independent adversarial passes (a hostile pre-mortem and a skeptical principal-architect review) reshaped the v0.1 plan. The honest corrections — including two to what I recommended earlier — are below.

1 · The extractor, not the guard, is make-or-break

v0.1Detect explicit forms — "in PandaDoc", "send via X". Phase 1 treated as a 5-line add-on.

HardenedA TOOL_PIN extractor (stdlib regex + capability-ledger inference for implicit deliverable nouns: contract, proposal, SOW). Go/no-go gate: ≥80% recall on 30 real Victor utterances before any guard is built.

2 · Extend the live system — don't build a parallel one

v0.1New deliverable_contract.py, new gate, new state file — alongside the existing scope-contract gate.

HardenedWe already run scope-contract-extract.py + scope-contract-gate.py in enforce mode. Add a TOOL_PIN clause + fields to the existing active-contract.json and extend the existing gate. One system, one audit trail.

3 · Correction: Letta comes off the enforcement path

Earlier rec"Letta-backed Deliverable Contract" as the cross-surface store.

HardenedLetta is a cloud memory store (already pulled from one path for reliability) — wrong primitive for a safety gate. Cross-surface sync moves to the filesystem (/Users/Shared/claude-coord/, already used for this). Letta stays as optional narrative memory only, never on the enforcement read path.

4 · Correction: the LLM-judge is deferred, not shipped

Earlier recDeepEval/Gemini judge as a core "best, not cheapest" layer.

HardenedIt's a category mismatch (a semantic judge for a channel-compliance problem), a client-data egress risk, and 4 uninstalled deps. Channel compliance is a deterministic tool-call diff (stdlib). The judge returns only if evidence shows real content-quality misses — and then as a local judge.

5 · Best-on-OpenClaw rests on a stronger pillar: proactive pre-steering

v0.1OpenClaw advantage = "Letta store" + per-turn recitation.

HardenedThe brain's message-assembly path lets us inject a tool-restriction directive upstream of the model's tool choice ("pinned: PandaDoc — do not call deploy_to_vercel this turn") — preventing the substitution from being attempted. clawed structurally can't do this. The guard becomes the safety net, not the front line.

6 · Self-heal moves first; approval UX gets a voice-friendly path

v0.1Self-heal at Phase 4; approval via a file the blocked agent somehow writes.

HardenedSelf-heal is Phase 1 (the .bak graveyard proves guards get unwired). Approval is an inline SWAP: <tool> reply — detected on the next turn, works by voice, single-use, session-scoped. No file-manager step, no trust-collapse-and-disable spiral.

What we reuse vs. build

You asked us to find who's already solved this and repurpose their code. The spine ships with zero new external dependencies — it extends our own substrate, plus a handful of patterns lifted from proven repos.

Need	Repurpose	Reuse
Contract extraction + gate	our own `lib/scope_contract.py` + `scope-contract-gate.py` (in production, enforce mode)	~70%
Self-healing invariant	our `standing_orders.py` `order()` + `_settings_ensure`	direct
Safe gate deploy	our `atomic-guard-publish.py` + `guard_mutation_lock`	direct
Cross-surface sync	`/Users/Shared/claude-coord/` filesystem dir (already the coordination layer)	direct
Alerts	our `lib/alert_route.py` (Gmail digest)	direct
Guard skeleton (check action vs state)	nizos/tdd-guard pattern MIT	pattern
Per-turn re-injection (clawed)	johnlindquist N-prompt-counter gist	~80%
Session handoff schema	AnastasiyaW session-handoff schema	pattern

Cut from the spine (kept the design honest): LangGraph (forces a graph-rewrite of a custom Python fabric), NeMo Guardrails (Colang overhead), Outlines (needs logit access — N/A for Claude API), DeepEval / Guardrails AI / Instructor / promptfoo (external deps that rot; the spine is stdlib), Letta-on-the-enforcement-path, OpenAI Evals (deprecated), Braintrust (commercial). Each was evaluated and rejected for a specific reason — not overlooked.

The roadmap

Six phases, re-sequenced for fastest risk reduction: prove the tripwire, make it self-healing, then add teeth, then make it best-on-OpenClaw, then prove it stays honest.

~1 day

Extend the schema — no new system

Build: add a TOOL_PIN clause type to lib/scope_contract.py; add pinned_tool / pinned_format / pinned_channel / approval_token / status / version to the existing active-contract.json; atomic versioned write; mirror to /Users/Shared/claude-coord/contracts/active.json.

Repurpose: the live scope-contract object + our atomic-publish/lock primitives. Zero new deps.

Verify · round-trip a contract local↔mirror; concurrent-write test proves the version lock rejects stale writes.

~1–2 days

The tripwire — extractor recall GO / NO-GO

Build: TOOL_PIN regex for explicit forms (via/in/using/as/through <tool>) + capability-ledger inference for implicit deliverable nouns (contract → PandaDoc). Multi-turn elaboration updates an open contract.

The gate: build a labeled corpus of 30 real Victor utterances (from session transcripts + InsightsLM) that should pin a deliverable. Require ≥80% recall before building any enforcement. If the tripwire can't fire, no guard matters.

Verify · "draw up the proposal for Kai" and "put the contract in PandaDoc" both produce a contract with the right pinned_tool.

~half day

Self-heal first durability

Build: register order("so-deliverable-contract", …, auto_heal=True) in standing_orders.py; re-wire the gate into both surfaces' settings via _settings_ensure; add an FSEvents watcher on settings for <5s re-wire (not just the 600s poll). Deploy via atomic-guard-publish.py.

Why before the guard: the .bak graveyard proves guards get unwired by unrelated changes. Wire self-heal first so everything after is covered from day one.

Verify · delete the gate from clawed settings → reconciler restores it within one cycle; heal-success is silent, only heal-FAILURE alerts.

~1–2 days

The teeth — PreToolUse + Stop backstop

Build: extend scope-contract-gate.py (not a parallel gate). PreToolUse: deliver-tool list driven by capability_ledger at runtime (no static allowlist rot); inspect tool_input + transcript for Bash/curl bypass to known endpoints (pandadoc.com, vercel API). Deny a mismatched tool with the deviation-flag reason.

Stop backstop: deterministic tool-call diff vs pinned_tool; on block, attempt reversible rollback (e.g. vercel rm); on repeat (stop_hook_active) write HALT + alert rather than silently exit. Approval: inline SWAP: <tool> reply → single-use session token.

Verify (falsifying test) · replay the real failure: contract pins PandaDoc, agent attempts Vercel → PreToolUse denies; turn ends with only a Vercel page → Stop blocks. A valid SWAP: passes.

OpenClaw

Wire into the brain _enforce_output Stop-subset + the dispatch tool path. Dispatched sub-tasks read the contract at startup (contract-passing).

clawed

Gate in ~/.claude/learning-substrate/gates/, wired in settings.json. Parity verified independently (don't assume shared backend carries).

~half day

Best-on-OpenClaw — proactive pre-steering OpenClaw edge

Build: _inject_contract_directive() in letta-steve-mcp's ask_steve(), upstream of the model call — when a contract is open, inject "pinned: PandaDoc; do not call deploy_to_vercel this turn." Plus an idempotent <system-reminder> recitation each turn.

clawed: idempotent-hash UserPromptSubmit injection (dodges the additionalContext accumulation bug #40216) — the recitation half only; clawed can't pre-steer tool choice the way OpenClaw can.

Verify · with a PandaDoc contract open, OpenClaw does not attempt a Vercel deploy (substitution prevented, not just caught).

~1 day

Eval + canary — prove it stays honest

Build: a replayable corpus (the PandaDoc case + adversarial near-misses: on-target alternative, verbose decoy, Bash-bypass, multi-turn pivot). A --selftest on the gate that exits non-zero on any miss, wired into guard-selftest-supervisor.py; a weekly canary. Stdlib only.

Verify · ≥95% catch on substitution attempts, <2% false-block on legitimate/approved deliveries (tunable).

Total: ~5–6 working days, zero new external dependencies in the enforcement path, built almost entirely on substrate that already exists and self-heals.

The DeliverableContract System.

The two red-teams agreed on the real risk: the guard was never the hard part — the tripwire is. You don't say "in PandaDoc." You say "draw up the proposal for Kai."