fix: make the smoke test reliably green #5

Open
philcunliffe wants to merge 1 commit into main from fix/flaky-smoke-test

@philcunliffe (Contributor)

Summary

The end-to-end smoke test (npm run smoke) was failing, partly flaky and partly outright broken, from three independent causes. It exits on the first failed check(), so each cause masked the next. This PR fixes all three. Verified 50/50 runs green (previously ~50% flaky).

Root causes & fixes

1. Mover "run now" could silently no-op — the real ~50% flake

mover.tick() had a running boolean guard that made a concurrent call return 0 immediately. The 200ms background timer and the admin POST /v1/admin/mover/run endpoint both call it, so when the test's runMover() landed while a background tick was mid-pass, the endpoint returned 200 without committing the just-spooled row. The follow-up query then saw 0 rows, and the test failed with Cannot read properties of undefined (reading 'client_received_at').

  • Split into opportunistic tick() (the timer keeps skipping when busy, never piles up) and guaranteed drain() (waits out any in-flight pass, then runs a fresh pass whose spool.pending() snapshot is guaranteed to include rows spooled just before the call).
  • POST /v1/admin/mover/run and the shutdown drain now use drain().
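
The tick()/drain() split described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: createMover, runPass, startPass, the commit callback, and the inFlight promise are hypothetical names, and only tick(), drain(), and spool.pending() come from the description.

```javascript
// Hypothetical sketch: only tick(), drain(), and spool.pending() are named in the PR.
function createMover(spool, commit) {
  let inFlight = null; // promise for the pass currently running, or null when idle

  async function runPass() {
    // The snapshot is taken synchronously when the pass starts.
    const rows = spool.pending();
    for (const row of rows) {
      await commit(row);
    }
    return rows.length;
  }

  function startPass() {
    const pass = runPass().finally(() => {
      // Clear the marker only if no newer pass has replaced it.
      if (inFlight === pass) inFlight = null;
    });
    inFlight = pass;
    return pass;
  }

  // Opportunistic: if a pass is already running, skip instead of queueing.
  async function tick() {
    if (inFlight) return 0;
    return startPass();
  }

  // Guaranteed: wait out any in-flight pass, then run a fresh one, so rows
  // spooled before this call are in the new pass's pending() snapshot.
  async function drain() {
    while (inFlight) {
      try {
        await inFlight;
      } catch {
        // A failed earlier pass should not prevent the fresh pass.
      }
    }
    return startPass();
  }

  return { tick, drain };
}
```

In this sketch the background timer would keep calling tick(), while an admin handler or shutdown hook would await drain(). The loop in drain() re-checks inFlight after each wait, so several concurrent drain() callers each get their own fresh pass rather than sharing a stale one.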

2. Config pin format diverged from the kernel's config wire schema

The save pipeline emitted a lock-file-shaped plugin entry into a config document: source as an object spec ({kind,raw,path}) and the hash under content_hash. But the kernel's parseConfigShape requires source to be a string, and the client verifies the pin under artifact_hash (hypaware config/apply_deps.js). The server was therefore producing a document that its own shape parser rejects on re-submission (the "pre-pinned document accepted without re-resolution" check).

  • Emit version + artifact_hash; keep source as the operator's raw string the client re-resolves.
  • validatePrePinned now validates that shape.
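
To make the two shapes concrete, here is a rough before/after of a single plugin entry, with a minimal shape check. The field names source, version, artifact_hash, content_hash, and {kind,raw,path} come from the description above. Every value, the string type assumed for version, and the looksLikeConfigPin helper are made up for illustration. The real parseConfigShape and validatePrePinned may check more than this.

```javascript
// Illustrative values only; none of these strings come from the repository.

// Before: lock-file-shaped entry (object source, hash under content_hash).
const lockFileShaped = {
  source: { kind: 'file', raw: '../plugins/example', path: '/abs/plugins/example' },
  content_hash: 'sha256:0000',
};

// After: config-wire-shaped pin (string source, version + artifact_hash).
const configPin = {
  source: '../plugins/example', // the operator's raw string, re-resolved by the client
  version: '1.0.0',
  artifact_hash: 'sha256:0000',
};

// Hypothetical helper: a minimal check of the pinned shape described above.
function looksLikeConfigPin(entry) {
  return (
    entry !== null &&
    typeof entry === 'object' &&
    typeof entry.source === 'string' &&
    typeof entry.version === 'string' && // assumption: version is a string
    typeof entry.artifact_hash === 'string'
  );
}
```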

3. Smoke proxy row predated the ai_gateway_messages schema

That dataset gained a required non-null session_id column (schema v6, kernel LLP 0030) → 422 missing_required_column. Added session_id to the proxy row.
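
As a rough illustration of why the old row was rejected, the sketch below checks a row against a list of required non-null columns. Only session_id comes from the description above. The role column, the session value, and the missingRequiredColumns helper are hypothetical, and the kernel's actual validation is not shown in this PR.

```javascript
// Hypothetical sketch: only the session_id column name comes from the PR text.
const REQUIRED_NON_NULL = ['session_id'];

// Return the required columns that are absent or null in the row.
function missingRequiredColumns(row, required = REQUIRED_NON_NULL) {
  return required.filter((col) => row[col] === undefined || row[col] === null);
}

const oldProxyRow = { role: 'user' }; // written before the column existed
const newProxyRow = { ...oldProxyRow, session_id: 'smoke-session-1' };
```

Under these assumptions, a non-empty result for oldProxyRow corresponds to the 422 missing_required_column response, and the empty result for newProxyRow corresponds to the fixed smoke row.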

Note: causes (2) and (3) are kernel-schema drift — hypaware is a live file:../hypaware sibling dependency whose shared schemas advanced after this test was written.

Test plan

  • npm run smoke → all 80 checks passed
  • Ran the suite 50× consecutively with 0 failures (was ~50% before).

🤖 Generated with Claude Code

The end-to-end smoke test was failing — partly flaky, partly broken —
from three independent causes. It exits on the first failed check, so
each masked the next; this fixes all three.

1. Mover "run now" could silently no-op (the real ~50% flake).
   mover.tick() had a `running` guard that made a *concurrent* call
   return 0 immediately. The 200ms background timer and the admin
   /v1/admin/mover/run endpoint both call it, so when runMover() landed
   mid-pass it returned 200 without committing the just-spooled row, and
   the follow-up query saw 0 rows. Split into opportunistic tick() (the
   timer keeps skipping, never piles up) and guaranteed drain() (waits
   out any in-flight pass, then runs a fresh pass whose pending()
   snapshot is guaranteed to include rows spooled just before the call).
   The admin endpoint and the shutdown drain now use drain().

2. Config pin format diverged from the kernel's config wire schema.
   The save pipeline emitted a lock-file-shaped plugin entry into a
   config document — object `source` ({kind,raw,path}) and `content_hash`
   — but the kernel's parseConfigShape requires `source` to be a string
   and the client verifies the pin under `artifact_hash` (hypaware
   config/apply_deps.js). The server thus produced a document its own
   shape parser rejects on re-submission. Emit `version` +
   `artifact_hash`, keep `source` as the operator's raw string the client
   re-resolves; validatePrePinned validates that shape.

3. Smoke proxy row predated the ai_gateway_messages schema. That dataset
   gained a required non-null `session_id` column (schema v6); add it.

Verified: 50/50 runs green (was ~50% flaky).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>