Skip to content

feat(supervise): delegate(intent) over supervise() — generic delegation verb, cost rides through (+ docs freshness fix)#355

Merged
drewstone merged 2 commits into
mainfrom
feat/delegate-over-supervise
Jun 22, 2026
Merged

feat(supervise): delegate(intent) over supervise() — generic delegation verb, cost rides through (+ docs freshness fix)#355
drewstone merged 2 commits into
mainfrom
feat/delegate-over-supervise

Conversation

@drewstone

Copy link
Copy Markdown
Contributor

What

Adds delegate(intent, opts) — the one generic delegation verb — as a thin front door over supervise(), plus the agent-facing delegate MCP tool. You hand it an INTENT (the outcome you want); a default authoring supervisor (router-brained, harness: null, systemPrompt = supervisorInstructions()) AUTHORS and spawns whatever worker that intent needs over the conserved-budget pool. No hardcoded coder/researcher profile — the supervisor writes its own. It returns supervise()'s SupervisedResult unchanged, so spentTotal (iterations / tokens / usd / ms) rides straight back — the cost channel delegate_code lacks.

Design

  • delegate() (src/runtime/supervise/delegate.ts): builds the default authoring-supervisor profile and calls supervise(). Reuses the conserved-budget pool, the completion oracle (DeliverableSpec), the coordination toolbox, and equal-compute accounting — no hand-rolled driver, spawn loop, or equal-k.
  • delegate MCP tool (src/mcp/tools/delegate.ts): the generic agent-facing replacement for delegate_code / delegate_research. The supervisor substrate (router / backend / deliverable) is injected at server construction (exactly as delegate_code injects its CoderDelegate); the agent supplies only the intent. Synchronous: returns the delivered output WITH its spentTotal.
  • Exports: delegate / DelegateOptions / defaultDelegateBudget from the runtime barrel (/loops, next to supervise); the delegate tool + types from the mcp barrel.

Purely additive. delegate_code / delegate_research / composeProductionAgentProfile / detachedSessionDelegate / coderProfile are all untouched.

The no-winner cost fix (the binding correctness fix)

The no-winner variant of SupervisedResult had no spentTotal field — but real conserved compute is spent before a run fails. So a failed delegation reported its cost as absent, and the delegate MCP tool fabricated a zero (spentTotal: zeroSpend). That defeats delegate()'s whole point: the caller learns the cost.

  • spentTotal: Spend is now a required field on the no-winner variant, computed off the same journal the winner path reads (spentFromJournal), DRY'd into one noWinner() builder over the two no-winner exit points.
  • The delegate MCP tool returns the real result.spentTotal on no-winner; the fabricated zeroSpend constant is removed.
  • BYOK nuance: a worker's in-box model spend is often the customer's key, not Tangle's. The meaningful no-winner cost is the driver/supervisor spend (router-side, Tangle-billed), which spentFromJournal's conserved-pool sum already captures — the right number.

Cost now rides back on both the winner and no-winner paths. Covered by new unit tests in tests/loops/delegate.test.ts, tests/mcp/delegate.test.ts, and tests/loops/supervise.test.ts (no-winner spentTotal is well-formed and non-negative).

Docs freshness fix (the red CI on the last push)

pnpm docs:check was red on main, pre-existing from #352, not introduced here:

  • docs/canonical-api.md version pin read 0.70.0 while package.json is 0.70.1 → bumped.
  • The ./platform export restored in feat(platform): restore ./platform export (0.70.1) #352 had no typedoc entryPoint → registered src/platform/index.ts in typedoc.json, regenerated docs/api/ (adds platform.md + the new delegate symbols).

pnpm docs:check now passes.

E2E (real router, proof the cost rides through)

examples/delegate/e2e-delegate-real.ts: a router-brained supervisor authors + spawns a worker (no hardcoded coder profile), the worker does real filesystem work via a granted write_file tool, and the deliverable gate reads disk (ground truth). Run against the live Tangle router (deepseek-v4-flash):

result.kind  : no-winner
file exists  : true  @ /tmp/delegate-e2e-*/out.txt
file content : "hello"
spentTotal   : {"iterations":2,"tokens":{"input":32256,"output":3263},"usd":0.0132661,"ms":3427}

Proven live: the supervisor authored + spawned a worker (no coderProfile); the worker did real work — out.txt containing hello on disk; and result.spentTotal carries real conserved cost ($0.013, 35k tokens) on the no-winner path — the exact field this PR adds.

The no-winner (not winner) outcome is a supervise() brain/completion-gate behavior, not a delegate() defect: delegate() returns supervise()'s result unchanged. The supervisor brain node (deepseek-v4-flash) failed to settle a winner after spawning; the worker's deliverable was met on disk regardless. A stronger brain model settles a winner when the router's tool-call path is healthy (it was intermittently returning 500s on the stronger models during the run). The delegate()-scoped claim — routing + cost-passthrough — is proven.

Verification

  • typecheck: 0 errors
  • lint (Biome): clean
  • build (tsup): ok
  • tests: 1076 passed, 1 skipped
  • pnpm docs:check: passes

Add delegate(intent, opts) — the one generic delegation verb, a thin front
door over supervise(). It hands the INTENT to a default authoring supervisor
(router-brained, harness null, systemPrompt = supervisorInstructions()) which
AUTHORS and spawns whatever worker the intent needs over the conserved-budget
pool. No hardcoded coder/researcher profile; the supervisor writes its own.

Returns supervise()'s SupervisedResult unchanged, so spentTotal (iterations,
tokens, usd, ms) rides straight back — the cost channel delegate_code lacks.

Add the delegate MCP tool (createDelegateHandler) as the agent-facing generic
replacement for delegate_code/delegate_research: it routes to delegate(), takes
the supervisor substrate (router/backend/deliverable) injected at server
construction (like coderDelegate), and returns the delivered output WITH its
spentTotal synchronously.

Purely additive — delegate_code/delegate_research/composeProductionAgentProfile/
detachedSessionDelegate/coderProfile are all untouched. Reuses supervise(),
the authoring skill, the conserved-budget pool, DeliverableSpec; no hand-rolled
driver, spawn loop, or equal-k.

Exports: delegate/DelegateOptions/defaultDelegateBudget from runtime barrel
(/loops, next to supervise); the delegate tool + types from the mcp barrel.
Regenerated docs/api/ for the new symbols.
…rsion pin + platform entryPoint)

The no-winner variant of SupervisedResult lacked a spentTotal field even though
real conserved compute is spent before a run fails — so a failed delegation
reported its cost as absent (and the delegate MCP tool fabricated a zero). Add
spentTotal as a required field on the no-winner variant and compute it off the
same journal the winner path reads (spentFromJournal), DRY'd into one noWinner()
builder over the two no-winner exit points. The delegate MCP tool now returns the
real result.spentTotal on no-winner; the fabricated zeroSpend constant is removed.
This is delegate()'s whole point: the cost rides back on BOTH paths.

Docs freshness gate (pnpm docs:check) was red on main from #352: the canonical-api
version pin read 0.70.0 while package.json is 0.70.1, and the restored ./platform
export had no typedoc entryPoint. Bump the pin to 0.70.1, register
src/platform/index.ts in typedoc.json, regenerate docs/api/ (adds platform.md +
the new delegate symbols).

Add examples/delegate/e2e-delegate-real.ts: a router-brained supervisor authors +
spawns a worker (no hardcoded coder profile), the worker does real filesystem work,
the deliverable gate reads disk, and result.spentTotal carries the real conserved
cost on both winner and no-winner paths.
@drewstone drewstone marked this pull request as ready for review June 22, 2026 08:19

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Auto-approved PR — bad18b92

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-22T08:19:56Z

@drewstone drewstone merged commit 2879849 into main Jun 22, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants