GitHub Copilot Claude model limits are inaccurate — missing max_prompt_tokens and wrong context values #858

@maximharizanov

Problem

The github-copilot provider entries for Claude models have inaccurate limit values. The data currently reports only a context/output pair, and the context value doesn't reflect the actual limits exposed by the Copilot API (CAPI).

Source of truth

The authoritative source for Copilot model limits is the CAPI endpoint GET https://api.githubcopilot.com/models. The response reports several distinct token limits per model, three of which matter here (max_context_window_tokens, max_prompt_tokens, max_output_tokens):

"family": "claude-opus-4.6",
"limits": {
    "max_context_window_tokens": 144000,
    "max_output_tokens": 64000,
    "max_prompt_tokens": 128000,
    "max_non_streaming_output_tokens": 16000
}
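
For reference, the limits object maps onto a small TypeScript type. Field names are copied verbatim from the response above; the rest of the catalog entry and the response envelope are omitted:

interface CapiModelLimits {
    max_context_window_tokens: number;        // overall context window
    max_output_tokens: number;                // output cap
    max_prompt_tokens: number;                // input-only (prompt) cap, distinct from the window
    max_non_streaming_output_tokens: number;  // output cap for non-streaming requests
}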

Current models.dev values vs CAPI (Claude models)

| Model             | models.dev context | models.dev output | CAPI max_context_window_tokens | CAPI max_prompt_tokens | CAPI max_output_tokens |
|-------------------|--------------------|-------------------|--------------------------------|------------------------|------------------------|
| claude-opus-4.6   | 128,000            | 64,000            | 144,000                        | 128,000                | 64,000                 |
| claude-sonnet-4.5 | 128,000            | 16,000            | 144,000                        | 128,000                | 32,000                 |
| claude-opus-4.5   | 128,000            | 16,000            | 160,000                        | 128,000                | 32,000                 |
| claude-sonnet-4   | 128,000            | 16,000            | 216,000                        | 128,000                | 16,000                 |
| claude-haiku-4.5  | ?                  | ?                 | 144,000                        | 128,000                | 32,000                 |

Every Claude model has discrepancies. The pattern is:

  1. context is underreported — models.dev uses 128k, but CAPI reports 144k–216k for max_context_window_tokens
  2. output is underreported for sonnet-4.5 and opus-4.5 (16k vs 32k actual)
  3. max_prompt_tokens (input limit) is missing entirely — CAPI provides a separate max_prompt_tokens field (128k for all Claude models), which is different from max_context_window_tokens

Impact

This data is consumed by OpenCode (via https://models.dev/api.json) to determine compaction thresholds. The compaction logic in OpenCode uses:

const usable = model.limit.input
    ? model.limit.input - reserved   // explicit input (prompt) limit, when populated
    : context - reserved             // fallback: whole context window minus headroom

Since limit.input is not set (because models.dev doesn't populate it), OpenCode falls back to context - reserved, using the underreported 128k value. This triggers compaction earlier than necessary, wasting ~16k tokens of usable context on every session.
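
Once limit.input is populated, a consumer can clamp against both values instead of relying on the fallback alone. A minimal sketch (hypothetical helper, not OpenCode's actual code):

function usablePromptTokens(
    limit: { context: number; input?: number },
    reserved: number
): number {
    // Budget implied by the context window minus response headroom,
    // e.g. 144000 - reserved with the corrected Copilot data.
    const byWindow = limit.context - reserved;
    // When an explicit prompt cap exists (128000 for Claude via CAPI),
    // respect it as well; otherwise fall back to the window-based budget.
    return limit.input !== undefined ? Math.min(limit.input, byWindow) : byWindow;
}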

Suggested fix

For github-copilot Claude models, update the limits to match CAPI. The models.dev schema already supports limit.input — it just needs to be populated:

"claude-opus-4.6": {
    "limit": {
        "context": 144000,
        "input": 128000,
        "output": 64000
    }
}
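
Applying the same mapping (context from max_context_window_tokens, input from max_prompt_tokens, output from max_output_tokens) to the other models in the table gives:

"claude-sonnet-4.5": {
    "limit": { "context": 144000, "input": 128000, "output": 32000 }
},
"claude-opus-4.5": {
    "limit": { "context": 160000, "input": 128000, "output": 32000 }
},
"claude-sonnet-4": {
    "limit": { "context": 216000, "input": 128000, "output": 16000 }
},
"claude-haiku-4.5": {
    "limit": { "context": 144000, "input": 128000, "output": 32000 }
}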

How to verify

Query the CAPI models endpoint with a valid Copilot token:

curl -s "https://api.githubcopilot.com/models" \
  -H "Authorization: Bearer $COPILOT_TOKEN" \
  -H "Copilot-Integration-Id: vscode-chat"

This returns the full model catalog with authoritative limits per model.
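
To inspect just the Claude entries, a short Node script works too. This is a sketch, assuming Node 18+ (global fetch), an ES module context for top-level await, and the usual top-level "data" array envelope around the catalog:

// Fetch the CAPI catalog and print the limits for each Claude model.
const res = await fetch("https://api.githubcopilot.com/models", {
    headers: {
        "Authorization": `Bearer ${process.env.COPILOT_TOKEN}`,
        "Copilot-Integration-Id": "vscode-chat",
    },
});
const catalog = await res.json();
// Assumption: models live in a top-level `data` array carrying the
// `family` and `limits` fields shown earlier in this issue.
for (const model of catalog.data ?? []) {
    if (typeof model.family === "string" && model.family.startsWith("claude")) {
        console.log(model.family, model.limits);
    }
}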
