## Problem

The `github-copilot` provider entries for Claude models have inaccurate limit values. The data currently reports a single context value that doesn't reflect the actual limits exposed by the Copilot API (CAPI).
## Source of truth

The authoritative source for Copilot model limits is the CAPI endpoint `GET https://api.githubcopilot.com/models`. The response includes three distinct token limits per model (plus a separate cap for non-streaming output):
"family": "claude-opus-4.6",
"limits": {
"max_context_window_tokens": 144000,
"max_output_tokens": 64000,
"max_prompt_tokens": 128000,
"max_non_streaming_output_tokens": 16000
}Current models.dev values vs CAPI (Claude models)
| Model | models.dev `context` | models.dev `output` | CAPI `max_context_window_tokens` | CAPI `max_prompt_tokens` | CAPI `max_output_tokens` |
|---|---|---|---|---|---|
| claude-opus-4.6 | 128,000 | 64,000 | 144,000 | 128,000 | 64,000 |
| claude-sonnet-4.5 | 128,000 | 16,000 | 144,000 | 128,000 | 32,000 |
| claude-opus-4.5 | 128,000 | 16,000 | 160,000 | 128,000 | 32,000 |
| claude-sonnet-4 | 128,000 | 16,000 | 216,000 | 128,000 | 16,000 |
| claude-haiku-4.5 | ? | ? | 144,000 | 128,000 | 32,000 |
Every Claude model has discrepancies. The pattern is:
- `context` is underreported: models.dev uses 128k, but CAPI reports 144k–216k for `max_context_window_tokens`
- `output` is underreported for sonnet-4.5 and opus-4.5 (16k vs 32k actual)
- `max_prompt_tokens` (the input limit) is missing entirely: CAPI provides a separate `max_prompt_tokens` field (128k for all Claude models), which is different from `max_context_window_tokens`
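To check this drift programmatically, one could diff the two feeds directly. A minimal sketch, assuming Node 18+ run as an ES module, a `COPILOT_TOKEN` environment variable, and guessed response shapes (the models.dev lookup path `["github-copilot"].models[...]` in particular is an assumption, not a documented contract):

```ts
// Sketch: flag drift between models.dev and CAPI for Copilot Claude models.
// Both response shapes below are assumptions; adjust if the real feeds differ.
const [modelsDev, capi] = await Promise.all([
  fetch("https://models.dev/api.json").then((r) => r.json()),
  fetch("https://api.githubcopilot.com/models", {
    headers: {
      Authorization: `Bearer ${process.env.COPILOT_TOKEN}`,
      "Copilot-Integration-Id": "vscode-chat",
    },
  }).then((r) => r.json()),
]);

for (const m of capi.data ?? []) {
  if (!m.family?.startsWith("claude")) continue;
  // Assumed lookup path: provider id, then a models map keyed by model id.
  const entry = modelsDev["github-copilot"]?.models?.[m.family];
  if (!entry?.limit) continue;
  if (
    entry.limit.context !== m.limits?.max_context_window_tokens ||
    entry.limit.input !== m.limits?.max_prompt_tokens ||
    entry.limit.output !== m.limits?.max_output_tokens
  ) {
    console.log(`${m.family}: models.dev`, entry.limit, "vs CAPI", m.limits);
  }
}
```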
## Impact
This data is consumed by OpenCode (via https://models.dev/api.json) to determine compaction thresholds. The compaction logic in OpenCode uses:
```ts
const usable = model.limit.input
  ? model.limit.input - reserved
  : context - reserved
```

Since `limit.input` is not set (because models.dev doesn't populate it), OpenCode falls back to `context - reserved`, using the underreported 128k value. This triggers compaction earlier than necessary, wasting ~16k tokens of usable context on every session.
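To put numbers on that for claude-opus-4.6, with a hypothetical `reserved` value (OpenCode's real reservation may differ), the fallback path leaves 16k tokens on the table:

```ts
// Illustrative arithmetic only; `reserved` is a made-up placeholder value.
const reserved = 8_000;

// Today: limit.input is unset and context is underreported at 128k.
const usableNow = 128_000 - reserved;   // 120,000

// With context corrected to CAPI's max_context_window_tokens (144k):
const usableFixed = 144_000 - reserved; // 136,000

console.log(usableFixed - usableNow);   // 16,000 tokens recovered per session
```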
## Suggested fix

For `github-copilot` Claude models, update the limits to match CAPI. The models.dev schema already supports `limit.input`; it just needs to be populated:
"claude-opus-4.6": {
"limit": {
"context": 144000,
"input": 128000,
"output": 64000
}
}How to verify
Query the CAPI models endpoint with a valid Copilot token:
```bash
curl -s "https://api.githubcopilot.com/models" \
  -H "Authorization: Bearer $COPILOT_TOKEN" \
  -H "Copilot-Integration-Id: vscode-chat"
```

This returns the full model catalog with authoritative limits per model.