refactor(storage): replace batch (queue,state) index with active_batch membership#233
refactor(storage): replace batch (queue,state) index with active_batch membership#233albertywu wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors how “active batches” are queried in the storage layer by removing reliance on a (queue, state) secondary index and introducing an explicit active_batch(queue, batch_id) membership table, enabling queue-scoped listing via primary-key-prefix reads.
Changes:
- Replace
BatchStore.GetByQueueAndStates(queue, states)withBatchStore.ListActive(queue)and move state filtering into callers. - Add
active_batchmembership table; remove theidx_queue_stateindex from the MySQL schema; update MySQLBatchStoreto maintain/self-heal membership. - Extend integration/contract tests to cover
ListActivesemantics and MySQL-specific self-healing behavior.
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| test/integration/submitqueue/extension/storage/suite.go | Adds contract tests for BatchStore.ListActive and exposes storage via GetStorage() for backend-specific assertions. |
| test/integration/submitqueue/extension/storage/mysql/storage_test.go | Adds MySQL-specific tests validating self-healing behavior of active_batch rows. |
| test/integration/submitqueue/extension/storage/mysql/BUILD.bazel | Adds entity dependency required by new MySQL integration tests. |
| submitqueue/orchestrator/controller/cancel/cancel.go | Switches cancel controller from GetByQueueAndStates to ListActive. |
| submitqueue/orchestrator/controller/cancel/cancel_test.go | Updates mocks/expectations for ListActive. |
| submitqueue/orchestrator/controller/batch/batch.go | Uses ListActive and filters desired states in-memory for conflict analysis. |
| submitqueue/orchestrator/controller/batch/batch_test.go | Updates mocks/expectations and adds coverage for in-memory filtering behavior. |
| submitqueue/extension/storage/mysql/schema/README.md | Documents the new membership-table approach and maintenance model. |
| submitqueue/extension/storage/mysql/schema/batch.sql | Removes secondary index from batch table schema. |
| submitqueue/extension/storage/mysql/schema/active_batch.sql | Adds new active_batch membership table schema. |
| submitqueue/extension/storage/mysql/batch_store.go | Implements membership-first Create and ListActive resolution + best-effort terminal cleanup. |
| submitqueue/extension/storage/mock/batch_store_mock.go | Updates mock interface to ListActive. |
| submitqueue/extension/storage/batch_store.go | Updates storage interface to ListActive and documents new contract. |
Files not reviewed (1)
- submitqueue/extension/storage/mock/batch_store_mock.go: Language not supported
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
ebba199 to
7c00f7b
Compare
…h membership Replace the secondary index over the batch table's mutable `state` column with an `active_batch` membership table that answers the only queue-scoped query the pipeline needs: "which batches in this queue are still active?" (the batch controller uses it to find conflict dependencies; the cancel controller uses it to find the batch holding a request). A row is intended to exist while its batch is non-terminal, so the table stays bounded by the live speculation window rather than growing with batch history. `queue` leads the PK so listing is a PK-prefix scan and the table is shardable by queue — an access pattern that ports cleanly to a key-value store (queue = partition key, batch_id = sort key), unlike a server-maintained secondary index over a mutable non-key column. Membership is best-effort, not an exact mirror of batch state, and is maintained without transactions: - Create writes the membership row before the batch row. This ordering is required for correctness: whenever a batch row is visible to a reader its membership row is already present, so a concurrent ListActive can never miss an active batch. INSERT IGNORE keeps the membership write idempotent across retries. - If the batch insert then fails, Create deliberately leaves the membership row in place. A returned error does not prove the row was not written (an ambiguous failure can commit the batch row and still return an error), so deleting would risk permanently orphaning a live, non-terminal batch from ListActive. A dangling membership is the safe direction. - ListActive resolves each member by primary key: a terminal batch's membership is best-effort removed (race-free — a terminal batch is fully committed and its id is never reused); a missing batch is skipped but NOT removed (it may belong to an in-flight Create that has written its membership but not yet its batch row). Cleanup failures are swallowed so a read never fails on index maintenance, and terminal-state writers (merge, speculate, dlq) need not touch the index. Genuinely dangling rows (failed/crashed creates) and batches stuck in a non-terminal state are left for a future reconcile/prune job, documented in schema/README.md. Integration tests cover the self-heal and membership invariants: - TestActiveBatch_SelfHealsTerminalMembership - TestActiveBatch_SkipsDanglingMembershipWithoutDeleting - TestActiveBatch_CreateKeepsMembershipOnDuplicate - TestActiveBatch_CreateKeepsMembershipOnFailedInsert Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
7c00f7b to
49119c4
Compare
sbalabanov
left a comment
There was a problem hiding this comment.
I suggest to move the "active batches" logic to the app layer. Which means that a separate entity with its own data store has to be created.
| - **Write overhead**: Every `INSERT` and `UPDATE` to the `batch` table must also update the secondary index, adding latency to write operations. | ||
| - **Storage cost**: The index consumes additional disk space proportional to the number of rows in the table. | ||
| - **Lock contention**: Under high write concurrency, index maintenance can increase lock contention on the affected index pages. | ||
| `queue` leads the composite primary key `(queue, batch_id)`, so listing a queue's active batches is a primary-key-prefix scan and the table is shardable by queue. On a key-value store the same shape maps directly onto a partition key (`queue`) and sort key (`batch_id`) with no secondary index. |
There was a problem hiding this comment.
Typical key-value stores do not support querying by partition key alone.
Possible solutions:
- make queue PK and list of ids as value. Cons: high conflict rate for updates
- require datastore to support ranged queries
| } | ||
| return nil, fmt.Errorf("failed to get active batch id=%q queue=%q: %w", id, queue, err) | ||
| } | ||
| if batch.State.IsTerminal() { |
There was a problem hiding this comment.
storage layer should be kept simple and not enforce the logic of the app.
| // | ||
| // Membership is tracked in active_batch (queue leads the PK), so listing is a | ||
| // PK-prefix scan that ports cleanly to a key-value store. Each member is fetched | ||
| // by primary key: a terminal batch's membership is best-effort removed (race-free, |
There was a problem hiding this comment.
this is not fully correct: list of batches in activeBatchIDs is not fetched by a primary key
Summary
Get rid of the index on the batch table, to conform with repository convention.
The orchestrator needs knowledge of which batches are currently active, which was the primary reason this index existed -- for quick looking of active batches.
Instead of relying on indices for quick lookup of active batches, create a secondary table instead
active_batchesthat is populated when batches are created and removed during orchestration process.Test Plan
✅
make fmt && make build && make test && make e2e-test