feat(providers): hosted-key support for LLM providers (flag-gated, no rate limiting)#5127
feat(providers): hosted-key support for LLM providers (flag-gated, no rate limiting)#5127TheodoreSpeaks wants to merge 8 commits into
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
PR SummaryHigh Risk Overview Key resolution in Billing and metrics: non-streaming responses bill when a platform hosted key was used (not only legacy hosted-model lists), apply the cost multiplier, and call Shared helpers in Reviewed by Cursor Bugbot for commit 4d27d1b. Bugbot is set up for automated code reviews on this repo. Configure here. |
Greptile SummaryThis PR routes LLM provider calls through the shared hosted-key framework (BYOK-first → platform key pool via
Confidence Score: 4/5The non-streaming and simple-streaming paths are correctly wired end-to-end; tool-path streaming for most providers still lacks the finalizeTiming call that would settle the hosted-key cost metric and apply the cost multiplier on drain. The flag-gated key resolution, rate-limiter mode:'none', and centralized cost settlement in settleStreamingLlmCost are all sound. The one structural gap — tool-path streaming providers omitting finalizeTiming in their createStream callbacks — was flagged in a prior review cycle and remains unaddressed. apps/sim/providers/streaming-execution.ts and the tool-path createStream callbacks in openai/core.ts, baseten/index.ts, fireworks/index.ts, together/index.ts, mistral/index.ts, and ollama/core.ts Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant E as executeProviderRequest
participant B as getApiKeyWithBYOK
participant RL as HostedKeyRateLimiter
participant P as Provider.executeRequest
participant M as hostedKeyMetrics
E->>B: provider, model, workspaceId, apiKey, userId
B->>B: isHosted and workspaceId and flag on?
alt BYOK workspace key exists
B-->>E: apiKey isBYOK true
else platform key available
B->>RL: acquireKey mode none
RL-->>B: success key envVarName
B-->>E: apiKey isBYOK false hostedKeyEnvVar
else fallback
B-->>E: legacy key no hostedKeyEnvVar
end
E->>P: executeRequest sanitizedRequest
alt throws
E->>M: recordFailed
E-->>E: re-throw
else StreamingExecution
E->>M: recordUsed
Note over P,M: On stream drain
P->>M: recordCostCharged via settleStreamingLlmCost
else ProviderResponse
E->>M: emitHostedKeyUsage
end
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant E as executeProviderRequest
participant B as getApiKeyWithBYOK
participant RL as HostedKeyRateLimiter
participant P as Provider.executeRequest
participant M as hostedKeyMetrics
E->>B: provider, model, workspaceId, apiKey, userId
B->>B: isHosted and workspaceId and flag on?
alt BYOK workspace key exists
B-->>E: apiKey isBYOK true
else platform key available
B->>RL: acquireKey mode none
RL-->>B: success key envVarName
B-->>E: apiKey isBYOK false hostedKeyEnvVar
else fallback
B-->>E: legacy key no hostedKeyEnvVar
end
E->>P: executeRequest sanitizedRequest
alt throws
E->>M: recordFailed
E-->>E: re-throw
else StreamingExecution
E->>M: recordUsed
Note over P,M: On stream drain
P->>M: recordCostCharged via settleStreamingLlmCost
else ProviderResponse
E->>M: emitHostedKeyUsage
end
Reviews (3): Last reviewed commit: "Merge remote-tracking branch 'origin/sta..." | Re-trigger Greptile |
| } else if (hostedKeyEnvVar) { | ||
| // Hosted key used: record usage now; cost is settled on stream drain | ||
| // inside createStreamingExecution (the single streaming cost seam). | ||
| hostedKeyMetrics.recordUsed({ |
There was a problem hiding this comment.
Inaccurate comment — Gemini does not use
createStreamingExecution
The comment says "cost is settled on stream drain inside createStreamingExecution (the single streaming cost seam)", but Gemini uses a bespoke createStreamingResult helper and calls settleStreamingLlmCost directly inside its own stream callback. createStreamingExecution is not involved in the Gemini path at all. The comment should note that Gemini settles cost in its own drain callback (or reference both seams).
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
|
@greptile review |
…-agent # Conflicts: # apps/sim/lib/api-key/byok.ts # apps/sim/lib/core/config/feature-flags.ts
|
@greptile review |
|
Addressed review:
Also fixed CI: completed provider test mocks ( |
| tool: response.model, | ||
| key: hostedKeyEnvVar as string, | ||
| costTotal: response.cost.total, | ||
| }) |
There was a problem hiding this comment.
Hosted metrics omit tool costs
Medium Severity
For non-streaming hosted-key responses, emitHostedKeyUsage runs right after token calculateCost, but sumToolCosts is applied to response.cost.total only later. CloudWatch hosted-key cost therefore excludes tool charges even when the returned response total includes them.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 5f9046e. Configure here.
|
Two more Bugbot findings reviewed:
|
|
Addressed the two streaming-metric symmetry findings (
Full suite green (8870 tests), tsc 0, lint 16/16. |
…t the chokepoint (covers gemini)
|
Addressed the gemini failure-metric finding ( Moved hosted-stream failure recording to the provider-agnostic chokepoint — This generalizes the previous per-provider failure handling, so it won't recur for other stream shapes. |
…ed-key cost/metrics
|
Fixed the High ( |
|
bugbot run |
There was a problem hiding this comment.
✅ Bugbot reviewed your changes and found no new issues!
1 issue from previous review remains unresolved.
Comment @cursor review or bugbot run to trigger another review on this PR
Reviewed by Cursor Bugbot for commit 4d27d1b. Configure here.


Summary
hosted-key-llmflag (default off)mode: 'none'so LLMs acquire platform keys with no queue/rate limitingcreateStreamingExecution(sharedsettleStreamingLlmCost), applying the cost multiplier the per-provider streaming paths previously omitted; gemini uses the same helperhostingconfig (envKeyPrefix+byokProviderId); pricing stays per-model viacalculateCostclassifyHostedKeyFailureacross tools/providers (fixes auth-error misclassification)Type of Change
Testing
Tested manually;
bun run lintclean,tsc --noEmit0 errors,check:api-validation:strictpassed, unit tests passing (hosted-cost, byok, rate-limiter, streaming-execution, tools)Checklist