fix(docs-vec): make the hybrid driver buildable and runnable on stock PHP #130

Merged
markshust merged 1 commit into develop from feature/docs-vec-driver on Jun 24, 2026

@markshust

Collaborator

Summary

The docs-vec audit (#128) flagged six defects plus a presumed structural wall — "PDO can't load SQLite extensions on stock PHP." The wall was a wrong API call, not a platform limit: Pdo\Sqlite::loadExtension() works on stock Homebrew PHP 8.5. Every defect turned out to be code/packaging, and the driver is now functional end-to-end.

Fixes (closes #128)

  1. Double-binding boot crash: module.php bound DocsSearchInterface in both bindings and singletons → BindingConflictException. Now bindings => [] + singleton factory.
  2. Unautowireable $packageRoot: VecRuntime + the download commands are now registered as singletons with __DIR__.
  3. 128 MB OOM: DownloadModelCommand now streams the ONNX model to disk (stream_copy_to_stream).
  4. Missing sqlite-vec binary: new docs-vec:download-extension command, pinned to v0.1.9, SHA-256-verified before extraction, fail-closed, per-platform (macOS/Linux/Windows × x86_64/arm64).
  5. Wrong/suggest-only transformers: codewithkyrian/transformers-php → codewithkyrian/transformers (^0.5 || ^0.6); kept optional with graceful FTS5-only fallback when extension/model/transformers are absent.
  6. Structural (decisive): VecRuntime now loads sqlite-vec via Pdo\Sqlite::loadExtension() instead of the blocked SELECT load_extension(...), with a probe that degrades to FTS5 on builds that compile the capability out.
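
For readers who have not used the driver-specific PDO subclass, the probe-then-load pattern from fix 6 might look roughly like the sketch below. It is not the actual VecRuntime code: the function name and the vec0.so path are hypothetical, and it assumes PHP 8.4+ where Pdo\Sqlite exists.

```php
<?php

/**
 * Try to load a SQLite extension into an open connection.
 * Returns true on success; false tells the caller to stay on FTS5 only.
 * The function name and the path argument are hypothetical.
 */
function tryLoadSqliteExtension(\PDO $pdo, string $extensionPath): bool
{
    // Pdo\Sqlite is the driver-specific subclass added in PHP 8.4.
    // loadExtension() is left out of builds compiled without extension
    // loading, so probe for the method instead of assuming it exists.
    if (!($pdo instanceof \Pdo\Sqlite) || !method_exists($pdo, 'loadExtension')) {
        return false;
    }

    if (!is_file($extensionPath)) {
        return false; // binary was never downloaded
    }

    try {
        $pdo->loadExtension($extensionPath);
        return true;
    } catch (\Throwable $e) {
        return false; // e.g. wrong platform binary; degrade rather than crash
    }
}

// Usage: open through the subclass when it is available.
if (extension_loaded('pdo_sqlite')) {
    $pdo = class_exists(\Pdo\Sqlite::class)
        ? new \Pdo\Sqlite('sqlite::memory:')
        : new \PDO('sqlite::memory:');

    // Hypothetical location of the downloaded sqlite-vec binary.
    $mode = tryLoadSqliteExtension($pdo, __DIR__ . '/vec0.so') ? 'hybrid' : 'fts5-only';
    echo $mode, PHP_EOL;
}
```

Returning a boolean instead of throwing keeps the fallback decision in one place: the caller only has to choose between hybrid and FTS5-only search.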

Plus latent bugs the never-run embedding path hid: HF model layout (onnx/ subdir + tokenizer files), pipeline() usage + mean-pooling, per-call model reload (the indexing OOM), vec0 integer-PK bind. The FTS half now sanitizes NL queries (parity with #127), co-located because marko/docs is contract-only.
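
As an illustration of the mean-pooling step mentioned above, the arithmetic can be sketched in plain PHP arrays. This only shows the computation; it is not the transformers library's API, and the function names, the attention-mask convention (1 = real token, 0 = padding), and the L2 normalization step are assumptions.

```php
<?php

/**
 * Mean-pool per-token embeddings into a single sentence vector,
 * skipping padding positions.
 *
 * @param list<list<float>> $tokenEmbeddings one row per token
 * @param list<int>         $attentionMask   1 = real token, 0 = padding
 * @return list<float>
 */
function meanPool(array $tokenEmbeddings, array $attentionMask): array
{
    if ($tokenEmbeddings === []) {
        throw new \InvalidArgumentException('No token embeddings given.');
    }

    $sum = array_fill(0, count($tokenEmbeddings[0]), 0.0);
    $kept = 0;

    foreach ($tokenEmbeddings as $i => $row) {
        if (empty($attentionMask[$i])) {
            continue; // padding token: contributes nothing
        }
        foreach ($row as $j => $value) {
            $sum[$j] += $value;
        }
        $kept++;
    }

    if ($kept === 0) {
        return $sum; // all padding: return the zero vector
    }

    return array_map(static fn (float $v): float => $v / $kept, $sum);
}

/**
 * Scale a vector to unit length so cosine and dot-product rankings agree.
 *
 * @param list<float> $vector
 * @return list<float>
 */
function l2Normalize(array $vector): array
{
    $norm = sqrt(array_sum(array_map(static fn (float $v): float => $v * $v, $vector)));

    return $norm > 0.0
        ? array_map(static fn (float $v): float => $v / $norm, $vector)
        : $vector;
}

// Two real tokens and one padding token; the padding row is ignored.
$sentence = l2Normalize(meanPool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0]));
```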

Verification

  • Real hybrid build over the docs corpus → all 8 expected docs in top-3 (V = 8/8), matching docs-fts.
  • 89 passed / 6 skipped / 0 failed across docs, docs-fts, docs-vec. phpcs + php-cs-fixer clean.
  • Downloaded binary/model are gitignored; root composer.json untouched.
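
The top-3 figure can be reproduced with a small harness. The sketch below assumes V counts the queries whose expected doc appears among the first three results; the function name, the query set, and the stub search callable are hypothetical and stand in for the real driver.

```php
<?php

/**
 * Count how many queries return their expected document within the top $k.
 *
 * @param array<string, string>          $expected query => expected doc id
 * @param callable(string): list<string> $search   returns ranked doc ids
 * @return array{hits: int, total: int}
 */
function topKHits(array $expected, callable $search, int $k = 3): array
{
    $hits = 0;

    foreach ($expected as $query => $docId) {
        // Numeric-looking array keys become ints in PHP, hence the cast.
        $topK = array_slice($search((string) $query), 0, $k);
        if (in_array($docId, $topK, true)) {
            $hits++;
        }
    }

    return ['hits' => $hits, 'total' => count($expected)];
}

// Stub index standing in for a real search driver.
$fakeIndex = [
    'install' => ['getting-started', 'install', 'faq'],
    'routing' => ['faq', 'api', 'config', 'routing'],
];

$result = topKHits(
    ['install' => 'install', 'routing' => 'routing'],
    static fn (string $query): array => $fakeIndex[$query] ?? [],
);

printf("V = %d/%d\n", $result['hits'], $result['total']); // prints "V = 1/2"
```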

docs-fts remains the recommended zero-infra default; docs-vec is now a working opt-in semantic option.

🤖 Generated with Claude Code

… PHP

The docs-vec audit had flagged six defects and a presumed structural wall
(PDO cannot load extensions). The wall was a wrong API call, not a platform
limit: Pdo\Sqlite::loadExtension() works on stock PHP 8.5. Every defect was
code/packaging.

Fixes:
- module.php: stop double-binding DocsSearchInterface (bindings => []) and
  register VecRuntime + the download commands as singletons with $packageRoot
- VecRuntime: load sqlite-vec via Pdo\Sqlite::loadExtension() (not SELECT
  load_extension), probe extension-loading support, memoize the embedding
  pipeline (fixes the indexing OOM)
- new docs-vec:download-extension command: pinned sqlite-vec v0.1.9,
  SHA-256-verified per platform (macOS/Linux/Windows × x86_64/arm64), streamed
- DownloadModelCommand: stream the ONNX model to disk (no 128MB OOM)
- graceful FTS5-only fallback when the extension/model/transformers are absent
- composer suggest: codewithkyrian/transformers-php -> codewithkyrian/transformers
  (^0.5 || ^0.6); fix HF model layout + transformers pipeline usage
- sanitize the FTS half's NL queries (parity with docs-fts #127), co-located
  because marko/docs is contract-only
- downloaded binary/model are gitignored; docs updated (drivers, package page, README)
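
A minimal sketch of the streamed, fail-closed download described in the list above, assuming a helper that is not part of the PR: the function name and the .part suffix are hypothetical. The real command verifies an archive before extracting it, whereas this sketch verifies a single file before moving it into place.

```php
<?php

/**
 * Stream $source to $destination and keep the file only if its SHA-256
 * digest matches $expectedSha256. All names here are hypothetical.
 */
function streamAndVerify(string $source, string $destination, string $expectedSha256): void
{
    $in = @fopen($source, 'rb');
    if ($in === false) {
        throw new \RuntimeException("Cannot open source: {$source}");
    }

    $partial = $destination . '.part';
    $out = @fopen($partial, 'wb');
    if ($out === false) {
        fclose($in);
        throw new \RuntimeException("Cannot write: {$partial}");
    }

    // Copies in chunks, so memory use stays flat regardless of file size.
    $copied = stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);

    $actual = $copied === false ? false : hash_file('sha256', $partial);

    // Fail closed: anything other than an exact digest match discards the file.
    if ($actual === false || !hash_equals(strtolower($expectedSha256), $actual)) {
        @unlink($partial);
        throw new \RuntimeException('Download failed or checksum mismatch; nothing was installed.');
    }

    if (!rename($partial, $destination)) {
        @unlink($partial);
        throw new \RuntimeException("Cannot move file into place: {$destination}");
    }
}
```

Writing to a temporary name and renaming only after the digest check means a partial or tampered file never appears at the final path. Passing an http(s) URL as $source requires allow_url_fopen to be enabled.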

Verified: real hybrid build over the docs corpus returns all 8 expected docs in
top-3 (V = 8/8), matching docs-fts. 89 passed / 6 skipped / 0 failed across
docs, docs-fts, docs-vec. docs-fts remains the recommended zero-infra default.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions (bot) added the bug label on Jun 24, 2026
@markshust merged commit c295853 into develop on Jun 24, 2026
2 checks passed
@markshust deleted the feature/docs-vec-driver branch on June 24, 2026 01:11

Development

Successfully merging this pull request may close these issues.

docs-vec: un-buildable / non-functional on standard PHP PDO-SQLite (6 stacked defects)