
Server-side context graph: GitHub capture + T0 projection + neighbors#6

Open
philcunliffe wants to merge 3 commits into main from feat/server-side-graph

Conversation

@philcunliffe

Contributor

What & why

The server could collect forwarded logs but couldn't host a graph: its kernel plugin
set is hardcoded and the admin attach is SQL-only. This adds a server-side context
graph: the server captures GitHub activity directly, projects both GitHub events and
forwarded LLM sessions into one node/edge graph, and answers graph queries. That
enables the GitHub↔LLM convergence that 1.6's git-bridge was built for (sessions and
GitHub activity share content-addressed Repo/Commit/File nodes).

Design rationale lives in llp/0010-server-side-graph.decision.md — not duplicated here.

Changes

  • Vendor @hypaware/github under plugins/github/ (no git remote yet) and load it +
    bundled @hypaware/context-graph + @hypaware/ai-gateway-graph into the server kernel
    (src/boot.js). The github poll source stays dormant; capture is an admin one-shot.
  • New admin ops + CLI: github-backfill, graph-project, graph-neighbors
    (src/http/routes-admin.js, bin/admin.js) — reuse the plugins' pure functions over the
    server kernel's query/storage handles (src/daemon.js, re-exported via src/kernel/shim.js).
  • [github] config from HYPSERVER_GITHUB_* env; the GitHub token stays in the box env,
    never in config (src/config.js, src/types.d.ts).
  • Registry fix: self-managed graph datasets (github_events/node/edge) keep their own
    read closures instead of the date= partition synthesis used for wire ingest
    (src/catalog/registry.js).
  • Flush github_events after backfill so captured rows are immediately queryable (src/daemon.js).
  • LLP 0010 + @ref annotations; smoke extended to drive the full
    backfill → project → neighbors chain hermetically (in-memory GitHub client, no network).

Two non-obvious bugs the new smoke test caught

  • Capture's appendRows buffers in the cache writer — needed an explicit flush or rows aren't queryable.
  • The catalog registry overwrote the graph datasets' own read closures with date= synthesis
    (which only fits forwarded ingest) → projection/neighbors silently read zero rows.
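
The first bug can be illustrated with a minimal sketch, assuming a writer that stages appended rows in memory until it is flushed. createBufferedWriter and its query method are hypothetical stand-ins, not the cache writer's actual API.

```javascript
// Hypothetical buffered writer: appendRows only stages rows in memory;
// query reads what has been flushed to the (simulated) store.
function createBufferedWriter() {
  const buffer = []
  const stored = []
  return {
    appendRows(rows) {
      buffer.push(...rows)
    },
    flush() {
      stored.push(...buffer.splice(0)) // move every staged row to the store
    },
    query() {
      return stored.slice()
    },
  }
}

const writer = createBufferedWriter()
writer.appendRows([{ id: 'evt-1' }, { id: 'evt-2' }])
const beforeFlush = writer.query().length // staged rows are not visible yet
writer.flush()
const afterFlush = writer.query().length // both rows are now queryable
```

In this sketch a query issued between appendRows and flush sees nothing, which is the same symptom the smoke test surfaced after backfill.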

Testing

  • npm run smoke: 92 checks pass locally and inside the built linux image.
  • Deployed live to hypebox-1 (data volume kept, no wipe): backfilled 17,477 github_events
    across 12 repos → projected 13,607 nodes / 45,001 edges; graph-neighbors verified;
    cross-source convergence (Session ↔ Repo/Commit/File) confirmed once LLM logs forwarded.

Notes / follow-ups (non-blocking)

  • plugins/github/ is vendored third-party code; its @refs resolve against the
    @hypaware/github corpus, so exclude it from this repo's /ref-check (noted in LLP 0010).
  • The admin CLI (bin/admin.js, undici) times out the client at ~300s on long ops like an
    org-wide backfill; the server handler completes anyway. Worth a longer headersTimeout or an
    async backfill + status endpoint.
  • The live run used a GitHub PAT scoped to 12 repos; broaden it + re-backfill to cover the org.
  • Observed in the live run: clients forward their local github_events too and the server
    accepted them. Decide whether that's desired (the server pulls its own) or should be rejected.

🤖 Generated with Claude Code

philcunliffe and others added 3 commits June 17, 2026 13:18
The end-to-end smoke test was failing — partly flaky, partly broken —
from three independent causes. It exits on the first failed check, so
each masked the next; this fixes all three.

1. Mover "run now" could silently no-op (the real ~50% flake).
   mover.tick() had a `running` guard that made a *concurrent* call
   return 0 immediately. The 200ms background timer and the admin
   /v1/admin/mover/run endpoint both call it, so when runMover() landed
   mid-pass it returned 200 without committing the just-spooled row, and
   the follow-up query saw 0 rows. Split into opportunistic tick() (the
   timer keeps skipping, never piles up) and guaranteed drain() (waits
   out any in-flight pass, then runs a fresh pass whose pending()
   snapshot is guaranteed to include rows spooled just before the call).
   The admin endpoint and the shutdown drain now use drain().

2. Config pin format diverged from the kernel's config wire schema.
   The save pipeline emitted a lock-file-shaped plugin entry into a
   config document — object `source` ({kind,raw,path}) and `content_hash`
   — but the kernel's parseConfigShape requires `source` to be a string
   and the client verifies the pin under `artifact_hash` (hypaware
   config/apply_deps.js). The server thus produced a document its own
   shape parser rejects on re-submission. Emit `version` +
   `artifact_hash`, keep `source` as the operator's raw string the client
   re-resolves; validatePrePinned validates that shape.

3. Smoke proxy row predated the ai_gateway_messages schema. That dataset
   gained a required non-null `session_id` column (schema v6); add it.

Verified: 50/50 runs green (was ~50% flaky).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ghbors)

Load @hypaware/context-graph and a vendored @hypaware/github into the
server's own kernel so the server can capture github_events directly,
project the T0 graph, and answer neighbors queries over its own cache —
where the forwarded 1.6 LLM logs also live, enabling GitHub<->LLM
convergence (hypaware LLP 0032).

- boot: activate context-graph + github (poll source dormant); inject the
  [github] section from HYPSERVER_GITHUB_* env, token stays in box env
- daemon: services.githubBackfill/graphProject/graphNeighbors reuse the
  plugins' pure functions over the kernel query+storage; flush github_events
  after capture so it is immediately queryable
- routes-admin + admin CLI: github-backfill, graph-project, graph-neighbors
- registry: self-managed graph datasets keep their own read closures
  (source=/graph_v1 layout), not the date= synthesis used for wire ingest
- shim: re-export projectGraph/queryNeighbors/requireGraphRuntime anchored
  on bundledWorkspaceDir for module-singleton identity
- smoke: full hermetic backfill->project->neighbors chain (no network)
- LLP 0010 documents the decision; plugins/github vendored (own ref corpus)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Maps forwarded ai_gateway_messages into the same node/edge graph; its
bridge-ready Repo/Commit/File keys converge by content-addressed id with
github's, so one graph-project spans both sources.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@philcunliffe

Contributor Author

Dual-agent review — request_changes

  • Verdict: request_changes
  • Risk class: medium
  • Auto-merge advisory: 👎 thumbs down — verdict is request_changes; needs human-gated follow-up

Advisory only: no merge was attempted.

⚠️ Codex reviewer was unavailable — its CLI routes through the local gateway proxy (127.0.0.1:8787), which dropped the response stream mid-flight on two attempts (stream disconnected before completion, 5 reconnects each). This verdict rests on the Claude review alone (5 parallel subagents: guidance, bug-scan, history, contracts, comments/tests). Three of those subagents independently ran test/smoke.js end-to-end — 92/92 checks pass.

Risk capstone

Cross-reference: reviewer findings vs high-risk surfaces

| Source | Finding (severity, evidence) | Intersects |
| --- | --- | --- |
| Claude (comments/tests) | Incremental poll / cursor path untested: minor, conf 90 (smoke.js:560-605; capture.js:341-343,370; cursors.js:9-13) | Risks: incremental-capture surface; Concurrency: github capture tick |
| Claude (comments/tests) | graph/neighbors filters untested: minor, conf 88 (smoke.js:597; routes-admin.js:2990-3012) | Risks: incremental-capture surface; Direct callers: daemon→queryNeighbors |
| Claude (comments/tests) | bin/admin.js graph CLI wrappers untested: minor, conf 92 (bin/admin.js:21-33) | Risks: incremental-capture surface; Direct callers: admin CLI↔routes |
| Claude (comments/tests) | Multi-repo/org/ignore + per-repo error isolation untested: minor, conf 90 (capture.js resolveRepos/captureRepos) | Risks: incremental-capture surface; Concurrency: capture errors[] |
| Claude (comments/tests) | graph/project --source + dry_run untested: minor, conf 85 (daemon.js:166-167) | Risks: incremental-capture surface; Cross-package: server→context-graph |
| Claude (bug-scan, sub-threshold) | Poll-mode 304 comment misclassification: minor, conf 68 (capture.js:458-465,430-432) | Risks: incremental-capture surface |
| Claude (contracts, sub-threshold) | Stale content_hash log-field label: nit, conf 70 (save-pipeline.js:220) | Direct callers: save-pipeline pin fields |
| Claude (guidance, sub-threshold) | Vendored plugins/github/ Code-Style + broken @refs: minor, conf 50-55 | Cross-package: plugin manifest→loader (vendored carve-out, LLP 0010) |
| Codex | (codex unavailable: gateway-proxy stream disconnect, 2 attempts) | |

  • Codex review: (unavailable — see note above)
Claude review

Five parallel review subagents covered guidance-compliance, a shallow bug scan,
git-history regression analysis, contract/caller consistency, and comments+tests.
Three of them independently ran test/smoke.js end-to-end — all 92 checks pass,
including the new github-backfill → graph-project → graph-neighbors chain (checks
77–88). No logic bug, contract mismatch, or history regression survived scrutiny.
Every surviving finding below is a test-coverage gap on the new GitHub-capture
surface; all are scored ≥80 confidence but are minor in severity because no defect
was demonstrated on any exercised path and the most exposed path (incremental
polling) is dormant in the current server config (poll_interval is never set).

Findings the subagents raised that did not clear the ≥80 bar (recorded for the
risk cross-reference, not as blockers): vendored plugins/github/ Code-Style
deltas — inline import('...') types, a @typedef, and @refs that resolve
against a different corpus (conf 50–55, explicitly carved out by LLP 0010 as a
vendored tree); a poll-mode comment-misclassification latent bug on a 304 pulls
listing (conf 68, dormant path); and a stale content_hash log-field label in
save-pipeline.js:220 (nit, conf 70, log key only — not the wire pin).

Incremental poll / cursor-advancement path has no test coverage

  • Severity: minor
  • Confidence: 90
  • Evidence: test/smoke.js:560-605 (backfill-only); plugins/github/src/capture.js:341-343,370 and plugins/github/src/cursors.js:9-13 (untested high-water/resume logic)
  • Why it matters: smoke only ever calls mode:'backfill', which resets the cursor to {}; the since high-water, advancePullsHigh/changedSince, and the cursor sidecar read/write that drive incremental capture are never asserted — a regression in cursor math (and the related conf-68 304 comment-misclassification bug) would ship green. The path is dormant today, so this is forward-looking risk, not a current defect.
  • Suggested fix: add a second poll-mode tick after backfill with a fixture whose updated_at advances, asserting only-new rows are appended and github-cursors.json carries the high-water; this also exercises the 304/comment-type path.
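
The high-water behavior such a test would assert can be sketched as follows. This is a minimal illustration, not the logic in capture.js or cursors.js: selectChanged, advanceHigh, and pollTick are hypothetical names, and the sketch assumes updated_at values are ISO 8601 UTC strings in one fixed format so that string comparison orders them.

```javascript
// Hypothetical helpers; assumes updated_at values are ISO 8601 UTC strings
// in one fixed format, so plain string comparison orders them correctly.
function selectChanged(items, since) {
  return since ? items.filter((item) => item.updated_at > since) : items.slice()
}

function advanceHigh(since, items) {
  let high = since || null
  for (const item of items) {
    if (high === null || item.updated_at > high) high = item.updated_at
  }
  return high
}

// One tick: keep rows newer than the cursor, then move the cursor forward.
function pollTick(cursor, listing) {
  const rows = selectChanged(listing, cursor.since)
  return { rows, cursor: { since: advanceHigh(cursor.since, rows) } }
}

const listing = [
  { id: 1, updated_at: '2026-06-01T00:00:00Z' },
  { id: 2, updated_at: '2026-06-02T00:00:00Z' },
]
const backfill = pollTick({}, listing) // empty cursor: everything is new
const advanced = listing.concat({ id: 3, updated_at: '2026-06-03T00:00:00Z' })
const incremental = pollTick(backfill.cursor, advanced) // only id 3 is newer
```

A test along these lines asserts both halves of the contract: the second tick appends only the new row, and the cursor ends at the newest updated_at.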

graph/neighbors filter parameters untested

  • Severity: minor
  • Confidence: 88
  • Evidence: test/smoke.js:597 (single {type:'Repo',depth:2,direction:'both'} call) vs src/http/routes-admin.js:2990-3012 (parses direction in/out/both, edge_types[], limit)
  • Why it matters: the parameters most prone to a wrong-direction or off-by-one bug — direction:'in'/'out', edge_types filtering, limit truncation — are never asserted.
  • Suggested fix: add neighbors calls with direction:'out' and an edge_types filter, asserting the reachable set differs as expected.
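
To make the untested parameters concrete, here is a minimal in-memory sketch of a neighbors traversal with direction, edge-type, and limit handling. It assumes a hypothetical edge shape of { from, to, type } and is not the plugin's queryNeighbors implementation.

```javascript
// Hypothetical breadth-first neighbors over an in-memory edge list.
// edges: [{ from, to, type }]; returns reachable node ids, excluding the seed.
function neighbors(edges, seed, opts = {}) {
  const { depth = 1, direction = 'both', edgeTypes = null, limit = Infinity } = opts
  const allowed = edgeTypes ? new Set(edgeTypes) : null
  const seen = new Set([seed])
  const found = []
  let frontier = [seed]
  for (let d = 0; d < depth && frontier.length > 0; d++) {
    const next = []
    for (const edge of edges) {
      if (allowed && !allowed.has(edge.type)) continue
      const hops = []
      // 'out' follows from -> to, 'in' follows to -> from, 'both' does each.
      if (direction !== 'in' && frontier.includes(edge.from)) hops.push(edge.to)
      if (direction !== 'out' && frontier.includes(edge.to)) hops.push(edge.from)
      for (const node of hops) {
        if (seen.has(node)) continue
        seen.add(node)
        next.push(node)
        found.push(node)
        if (found.length >= limit) return found
      }
    }
    frontier = next
  }
  return found
}

const sampleEdges = [
  { from: 'commit:1', to: 'repo:a', type: 'IN_REPO' },
  { from: 'repo:a', to: 'org:x', type: 'OWNED_BY' },
]
const outOnly = neighbors(sampleEdges, 'repo:a', { direction: 'out' })
const inOnly = neighbors(sampleEdges, 'repo:a', { direction: 'in' })
const typed = neighbors(sampleEdges, 'repo:a', { edgeTypes: ['OWNED_BY'] })
const capped = neighbors(sampleEdges, 'repo:a', { limit: 1 })
```

Each of the four calls returns a different set from the same seed, which is the kind of difference the suggested assertions would pin down. The edge type names here are invented for the example.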

bin/admin.js graph command wrappers untested

  • Severity: minor
  • Confidence: 92
  • Evidence: bin/admin.js:21-33 (github-backfill / graph-project / graph-neighbors flag→body mapping); smoke calls the HTTP routes directly and never invokes the CLI
  • Why it matters: the arg-parsing glue (--edge-type → edge_types:[x], Number() on --depth/--limit, leading -- → no-node) is thin but entirely uncovered.
  • Suggested fix: a small unit test of the flag→body mapping, or accept as a documented known gap.
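
A unit test of that mapping could target a pure function along these lines. This is a sketch under assumptions: neighborsBody, the body field named node, and the argument layout are hypothetical; only the --edge-type, --depth, and --limit flags and the edge_types field come from the finding above.

```javascript
// Hypothetical flag-to-body mapping for a graph-neighbors style command.
// Assumed usage: graph-neighbors [node] [--depth n] [--limit n] [--edge-type t]...
function neighborsBody(argv) {
  const body = {}
  let i = 0
  // A first argument that starts with "--" is a flag, so no node was given.
  if (argv.length > 0 && !argv[0].startsWith('--')) {
    body.node = argv[0]
    i = 1
  }
  for (; i < argv.length; i += 2) {
    const flag = argv[i]
    const value = argv[i + 1]
    if (flag === '--edge-type') {
      // Repeated flags accumulate into one array.
      body.edge_types = (body.edge_types || []).concat(value)
    } else if (flag === '--depth' || flag === '--limit') {
      body[flag.slice(2)] = Number(value)
    } else {
      throw new Error(`unknown flag: ${flag}`)
    }
  }
  return body
}

const withNode = neighborsBody(['repo:a', '--depth', '2', '--edge-type', 'X', '--edge-type', 'Y'])
const withoutNode = neighborsBody(['--limit', '5'])
```

One case worth covering explicitly: Number() yields NaN for non-numeric input, so a mistyped --depth value would reach the server as NaN unless the wrapper rejects it.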

Multi-repo / org-enumeration / ignore-list capture untested

  • Severity: minor
  • Confidence: 90
  • Evidence: plugins/github/src/capture.js resolveRepos (repos ∪ org minus ignore, lowercased/deduped/sorted) and captureRepos per-repo error isolation; smoke uses one explicit repo, no orgs, no ignore
  • Why it matters: fleet selection and the guarantee that one failing repo doesn't abort the tick (errors[]) are load-bearing for real deployments and unexercised.
  • Suggested fix: a fixture with an org + an ignored repo plus an injected per-repo throw, asserting the resolved set and that the throw lands in errors[] without aborting.
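
The two behaviors named in the evidence can be sketched as below. The function names mirror the finding, but the signatures and input shapes (repos, orgRepos, ignore, captureOne) are assumptions made for illustration, not the vendored plugin's API.

```javascript
// Assumed inputs: explicit repo names, repo names enumerated from orgs,
// and an ignore list. Output: lowercased, deduplicated, sorted names.
function resolveRepos({ repos = [], orgRepos = [], ignore = [] }) {
  const ignored = new Set(ignore.map((name) => name.toLowerCase()))
  const all = new Set([...repos, ...orgRepos].map((name) => name.toLowerCase()))
  return [...all].filter((name) => !ignored.has(name)).sort()
}

// Per-repo isolation: a throw from one repo is recorded in errors[] and
// the loop continues with the next repo. captureOne is an assumed callback.
async function captureRepos(repos, captureOne) {
  const captured = []
  const errors = []
  for (const repo of repos) {
    try {
      captured.push({ repo, rows: await captureOne(repo) })
    } catch (err) {
      errors.push({ repo, message: err instanceof Error ? err.message : String(err) })
    }
  }
  return { captured, errors }
}

const resolved = resolveRepos({
  repos: ['Acme/Web', 'acme/api'],
  orgRepos: ['acme/web', 'acme/legacy'],
  ignore: ['ACME/legacy'],
})
```

A fixture like the one suggested would assert that resolved equals the expected set and that a captureOne which throws for one repo still lets the others complete.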

graph/project --source filter and dry_run untested

  • Severity: minor
  • Confidence: 85
  • Evidence: src/daemon.js:166-167 (sourceDataset filter when source given) and the dry_run path in routes-admin.js; smoke always posts json:{}
  • Why it matters: source-scoping is what makes the github-only vs cross-source convergence distinction real, and dry_run (writes nothing) is the safe-preview contract — neither is asserted.
  • Suggested fix: one graph/project with {source:'github_events'} and one {dry_run:true} asserting nodesWritten === 0.

Reports: .git/dual-review/pr-6

@philcunliffe

Contributor Author

🧭 Decision map — where to spend your attention

Companion to the dual-review verdict. This casts no verdict — it points at the 6 forks where the author made a real choice, so you can skim the rest.

Scanned: 40 hunks across 27 files (+2968 −48). Most is mechanical: ~14 new plugins/github/src/* files, the new llp/0010 doc, package-lock.json churn, and additive admin glue (bin/admin.js subcommands + routes-admin.js handlers forwarding to daemon services). The decisions worth your eyes, in order:

1. Served config-pin wire shape → artifact_hash + string source · Contract/shape

src/configs/save-pipeline.js:234

artifact_hash: fetched.contentHash,
  • Decision: the served config-pin now records version + artifact_hash + a string source, replacing content_hash + manifest_hash + an object source.
  • Alternative not taken: keep the dual-hash + {kind,...} object source — which the client's own shape parser (../hypaware .../config/schema.js) would now reject. This is a live cross-repo wire contract, not an internal rename.
  • Check: confirm every connected client version parses artifact_hash + string source; there's no compat shim, so a server/client skew breaks config-apply fleet-wide.
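
A skew check could start from a small shape test like the following sketch. checkPin is hypothetical, and treating version as a string is an assumption; the PR text only establishes that source must be a string and that the pin is verified under artifact_hash.

```javascript
// Hypothetical shape check for one served plugin pin. Only the string source
// and the artifact_hash field come from the PR text; the rest is assumed.
function checkPin(entry) {
  const problems = []
  if (typeof entry.source !== 'string') problems.push('source must be a string')
  if (typeof entry.version !== 'string') problems.push('version must be a string')
  if (typeof entry.artifact_hash !== 'string') problems.push('artifact_hash must be a string')
  for (const legacy of ['content_hash', 'manifest_hash']) {
    if (legacy in entry) problems.push(`${legacy} is a retired field`)
  }
  return problems
}

// Old lock-file-shaped entry vs the new config-shaped entry (values invented).
const oldShape = {
  source: { kind: 'npm', raw: 'example', path: 'example' },
  content_hash: 'abc',
  manifest_hash: 'def',
}
const newShape = { source: 'example', version: '1.0.0', artifact_hash: 'abc' }
const oldProblems = checkPin(oldShape)
const newProblems = checkPin(newShape)
```

Running a check like this against documents served to each deployed client version would show whether any client still expects the old fields.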

2. mover running boolean → opportunistic tick() vs guaranteed drain() · Concurrency/lifecycle

src/ingest/mover.js:73 · callers daemon.js:236 (shutdown), routes-admin.js:97 (run-now)

async drain() {  // await inflight, then run a guaranteed fresh pass
  • Decision: shutdown and admin "run now" call drain(); only the 200ms timer keeps opportunistic tick() (skip-if-inflight).
  • Alternative not taken: a single tick() everywhere (prior behavior — "run now" could silently no-op under a concurrent pass) or a real work queue.
  • Check: while (inflight) await inflight.catch(()=>{}) then a fresh pass — verify it can't miss a row spooled just before the call and can't hang shutdown. This is the previously-~50%-flaky path.
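
The split can be sketched in isolation to reason about the two properties in the check. This is a minimal model, not the code in src/ingest/mover.js: createMover, runPass, startPass, and the pending/commit callbacks are assumed names.

```javascript
// Hypothetical model of the tick()/drain() split. pending() returns a
// snapshot of spooled rows; commit(row) persists one row.
function createMover({ pending, commit }) {
  let inflight = null // promise of the pass currently running, if any

  async function runPass() {
    const rows = pending() // snapshot is taken when the pass starts
    for (const row of rows) await commit(row)
    return rows.length
  }

  function startPass() {
    const pass = runPass().finally(() => {
      if (inflight === pass) inflight = null
    })
    inflight = pass
    return pass
  }

  return {
    // Opportunistic: skip (return 0) while another pass is running.
    tick() {
      return inflight ? Promise.resolve(0) : startPass()
    },
    // Guaranteed: wait out any in-flight pass, then run a fresh one.
    async drain() {
      while (inflight) await inflight.catch(() => {})
      return startPass()
    },
  }
}
```

In this model the fresh pass takes its snapshot only after the earlier pass has settled, so a row spooled before drain() was called is included. The loop (rather than a single await) matters when two drain() calls overlap: the second re-checks inflight after the first has started its pass. Whether shutdown can hang depends on whether commit can block indefinitely, which this sketch does not bound.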

3. SELF_MANAGED_PLUGINS — by-plugin allowlist, not by-capability · Contract/policy

src/catalog/registry.js:24

const SELF_MANAGED_PLUGINS = new Set(['@hypaware/server', '@hypaware/context-graph', '@hypaware/github'])
  • Decision: only these three plugins' datasets keep their own createDataSource; everything else — notably @hypaware/ai-gateway, which also ships one — takes the date-partition synthesis path.
  • Alternative not taken: the obvious capability rule "any plugin exposing createDataSource manages itself", rejected because ai-gateway needs partition synthesis.
  • Check: this allowlist is load-bearing — a future dataset-shipping plugin not added here silently won't be queryable. Confirm the coupling is intended and discoverable.
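
The coupling is easier to see in a reduced form. The sketch below reuses the allowlist from the diff, but pickDataSource, the dataset shape, and synthesizeDatePartitionSource are hypothetical stand-ins for whatever registry.js actually does.

```javascript
const SELF_MANAGED_PLUGINS = new Set(['@hypaware/server', '@hypaware/context-graph', '@hypaware/github'])

// Hypothetical dataset shape: { name, plugin, createDataSource? }.
// synthesizeDatePartitionSource is an assumed stand-in for the date= path.
function pickDataSource(dataset, synthesizeDatePartitionSource) {
  const selfManaged =
    SELF_MANAGED_PLUGINS.has(dataset.plugin) && typeof dataset.createDataSource === 'function'
  if (selfManaged) {
    return { kind: 'self-managed', read: dataset.createDataSource() }
  }
  return { kind: 'date-partition', read: synthesizeDatePartitionSource(dataset.name) }
}

const synth = (name) => () => `date= partitions of ${name}`
const graphNode = { name: 'node', plugin: '@hypaware/context-graph', createDataSource: () => () => 'own layout' }
const gateway = { name: 'ai_gateway_messages', plugin: '@hypaware/ai-gateway', createDataSource: () => () => 'own layout' }
const future = { name: 'new_dataset', plugin: '@hypaware/some-new-plugin', createDataSource: () => () => 'own layout' }

const kinds = [graphNode, gateway, future].map((d) => pickDataSource(d, synth).kind)
```

The third dataset shows the risk named in the check: a plugin that ships its own createDataSource but is missing from the allowlist is routed to the date-partition path without any error. The plugin name @hypaware/some-new-plugin is invented for the example.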

4. Unknown neighbors seed → HTTP 400, not 200-empty · Unhappy-path / contract (test-pinned)

src/http/routes-admin.js:158

return sendJson(res, result && result.ok === false ? 400 : 200, result)
  • Decision: an unresolvable graph seed is a client error (400); smoke locks it ("neighbors 400s on an unknown seed").
  • Alternative not taken: 200 with an empty neighbor set — treat "no such node" as a valid empty traversal.
  • Check: confirm 400-on-unknown-seed is the contract admin clients/CLI should code against; it's now frozen by an assertion.
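
For reference, the status expression from the diff can be evaluated on its own. statusFor is a hypothetical wrapper around that expression, and the result objects are invented examples.

```javascript
// Same expression as the route handler, isolated for inspection.
function statusFor(result) {
  return result && result.ok === false ? 400 : 200
}

const unknownSeed = statusFor({ ok: false, error: 'unknown seed' }) // 400
const emptyButValid = statusFor({ ok: true, neighbors: [] }) // 200
const missingResult = statusFor(undefined) // 200
```

Only an explicit ok === false produces 400. A result that is undefined or null falls through to 200, so the contract also relies on the service always returning an object.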

5. Comment type discriminated by this run's PR set · Algorithm / unhappy-path

plugins/github/src/capture.js:302

const onPr = prNumbers.has(number)
return base(onPr ? 'pull_request_comment' : 'issue_comment', `comment:${c.id}`, repo, {
  • Decision: a comment is classified PR-vs-issue solely by whether its subject number was fetched as a PR this run.
  • Alternative not taken: persist known PR numbers in the cursor, or classify against already-captured github_events — i.e. discriminate on all-time PRs, not this run's listing.
  • Check: on a poll where the pulls listing 304s, prNumbers is empty → a new comment on an existing PR becomes an issue_comment (wrong edge type). Dormant today (poll disabled); revisit before enabling poll.
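
The failure mode in the check can be reproduced with a reduced version of the classification. classifyComment mirrors the ternary in the diff; classifyWithHistory sketches the alternative not taken, and knownPrNumbers is a hypothetical persisted set.

```javascript
// Reduced form of the decision: classify by this run's PR numbers only.
function classifyComment(comment, prNumbers) {
  return prNumbers.has(comment.number) ? 'pull_request_comment' : 'issue_comment'
}

// Sketch of the alternative not taken: also consult PR numbers remembered
// from earlier runs (knownPrNumbers is a hypothetical persisted set).
function classifyWithHistory(comment, prNumbers, knownPrNumbers) {
  const onPr = prNumbers.has(comment.number) || knownPrNumbers.has(comment.number)
  return onPr ? 'pull_request_comment' : 'issue_comment'
}

const comment = { id: 9001, number: 42 } // comment on existing PR #42
const thisRun = new Set() // pulls listing returned 304, so nothing was fetched
const remembered = new Set([42])

const asShipped = classifyComment(comment, thisRun)
const withHistory = classifyWithHistory(comment, thisRun, remembered)
```

With an empty per-run set the shipped logic labels the comment an issue_comment, while the history-aware variant keeps it a pull_request_comment.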

6. Pagination cap 50 + truncate-and-warn · Magic value + unhappy-path policy

plugins/github/src/github_client.js:26

const MAX_PAGES = 50
  • Decision: cap any listing at 50 pages and emit github.listing_truncated rather than failing or paging unbounded.
  • Alternative not taken: page to exhaustion (unbounded memory/time on a huge repo) or hard-fail on overflow.
  • Check: 50 × PER_PAGE is the silent ceiling on one repo's capture; truncation only warns, so confirm it clears your largest repos.
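
The truncate-and-warn policy can be sketched as a bounded paging loop. This is an illustration only: listAll, the fetchPage callback, and the perPage default of 100 are assumptions, not the contents of github_client.js.

```javascript
const MAX_PAGES = 50

// fetchPage(page) is an assumed callback that resolves to an array of items.
// A page shorter than perPage is treated as the end of the listing.
async function listAll(fetchPage, { perPage = 100, warn = console.warn } = {}) {
  const items = []
  for (let page = 1; page <= MAX_PAGES; page++) {
    const batch = await fetchPage(page)
    items.push(...batch)
    if (batch.length < perPage) return { items, truncated: false }
  }
  // Cap reached while pages were still full: keep what we have and warn.
  warn('github.listing_truncated', { pages: MAX_PAGES, items: items.length })
  return { items, truncated: true }
}
```

With these assumed values the ceiling is 50 × 100 = 5,000 items per listing. One edge of this sketch: a listing of exactly 50 full pages is reported as truncated even though nothing was dropped, because the loop never requests page 51 to find out.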

Honorable mentions (real but lower-stakes): capture.js:92 (backfill resets each repo's cursor to {}, a full re-fetch, vs incremental); config.js (token_env defaults to GITHUB_TOKEN); graph_contract.js (projectorVersion stamped on the contract to enable reprojection/invalidation).

Generated by /decision-map. Advisory — directs attention, casts no verdict.

