fix(compiler): don't cap concepts-plan at max_tokens=2048 (empty plans on reasoning models)#90

Open
cnndabbler wants to merge 1 commit into
VectifyAI:mainfrom
cnndabbler:fix/concepts-plan-token-cap

@cnndabbler

Problem

The concepts-plan LLM call is hard-capped at max_tokens=2048:

], "concepts-plan", max_tokens=2048, response_format=_JSON_RESPONSE_FORMAT)

With reasoning/thinking models (e.g. ollama/qwen3.6:27b, DeepSeek reasoners), the model can spend the entire 2048-token budget on reasoning before emitting any JSON, so the response comes back empty. The plan then fails to parse and zero concept pages are generated. The only trace is a warning, and the doc is still marked [OK]:

openkb.agent.compiler WARNING: Failed to parse concepts plan: Expecting value: line 1 column 1 (char 0). Raw output (first 500 chars): ''
[WARN] concepts plan unparseable for <doc> — no concept pages generated.

This is the failure reported in #80 (model: ollama/qwen3.6:27b). With a local thinking model, I saw it on roughly 50% of docs before lifting the cap.
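For reference, the parse error in the log above is what Python's standard `json` module reports for an empty string. The sketch below reproduces it in isolation; `parse_plan` is a hypothetical stand-in for illustration, not the actual parser in openkb.agent.compiler.

```python
import json
from typing import Any, Optional, Tuple


def parse_plan(raw: str) -> Tuple[Optional[Any], Optional[str]]:
    """Hypothetical stand-in for the plan parser: returns (plan, error message)."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        # An empty response has no JSON value at all, so decoding fails at char 0.
        return None, str(exc)


# Simulates a response whose whole token budget went to reasoning: no JSON emitted.
plan, error = parse_plan("")
print(error)
```

A truncated but non-empty response would fail with a different message, so an empty `Raw output` in the warning is a useful hint that the budget ran out before any JSON was written.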

Fix

Remove the max_tokens=2048 cap so concepts-plan is uncapped, matching the existing uncapped summary call in the same pipeline. Content-rich plans (and reasoning-model preludes) then complete instead of truncating to empty.

Note / alternative

If a hard ceiling is preferred for cost control, an alternative would be to make it configurable (e.g. a concepts_plan_max_tokens config key, default high or None) rather than a fixed 2048. Happy to adjust to whichever you prefer, but the current fixed 2048 is too low for reasoning models, and hitting it loses the whole plan (empty output → no concepts).
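A minimal sketch of the configurable alternative, assuming the config is available as a plain dict. The helper name `plan_call_kwargs` and the dict-based config are assumptions for illustration; only the `concepts_plan_max_tokens` key name comes from the suggestion above, and this is not code from the PR.

```python
from typing import Any, Dict, Optional


def plan_call_kwargs(config: Dict[str, Any]) -> Dict[str, Any]:
    """Build optional keyword arguments for the concepts-plan LLM call.

    Hypothetical helper: a missing or None `concepts_plan_max_tokens`
    means "uncapped", so `max_tokens` is omitted entirely.
    """
    limit: Optional[int] = config.get("concepts_plan_max_tokens")
    kwargs: Dict[str, Any] = {}
    if limit is not None:
        if limit <= 0:
            raise ValueError("concepts_plan_max_tokens must be a positive integer")
        kwargs["max_tokens"] = limit
    return kwargs


print(plan_call_kwargs({}))                                  # uncapped by default
print(plan_call_kwargs({"concepts_plan_max_tokens": 8192}))  # explicit ceiling
```

The result would be splatted into the existing call in place of the literal `max_tokens=2048`, so the default behaviour matches the uncapped fix while still allowing a ceiling for cost control.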

Fixes #80.

Reasoning/thinking models can exhaust a 2048-token budget before emitting
the JSON plan, yielding empty output -> unparseable plan -> zero concept
pages (silently). Remove the cap so concepts-plan matches the uncapped
summary call.
Development

Successfully merging this pull request may close these issues.

Concept generation fails