Context window
Unit: tokens. The maximum number of tokens the model can hold in context, input and output combined.
What is a context window?
The context window is the maximum number of tokens an LLM can process in a single call, input and output combined. Think of it as the model's working memory: like a desk, it holds everything the model can see at once. A larger desk fits more material, and a larger context window fits more context, making more complex tasks possible in one pass.
Why it matters
Context window size determines what you can accomplish in a single API call. A small context window forces you to chunk documents, summarise earlier turns, or use retrieval systems. A large context window lets you process entire codebases or full contracts in one pass. For developers building applications, this is a hard engineering constraint that shapes architecture decisions every day.
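To make that constraint concrete, here is a minimal chunking sketch. It assumes the rough 4-characters-per-token heuristic discussed later on this page; the `estimate_tokens` and `chunk_text` helpers are illustrative, not part of any provider SDK, and real pipelines would use the provider's tokeniser plus overlap between chunks.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token in English text.
    return max(1, len(text) // 4)

def chunk_text(text: str, max_tokens: int) -> list[str]:
    """Split text into pieces that each fit a token budget.

    Splits on paragraph boundaries; a sketch only (no overlap,
    no sentence-aware splitting, oversized paragraphs pass through).
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = (current + "\n\n" + para) if current else para
        if estimate_tokens(candidate) <= max_tokens:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

With a small context window you would call `chunk_text(document, budget)` and summarise or process each piece separately; with a large window the whole document goes in one call.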
Where models stand
- 2,000,000 tokens
- 1,000,000 tokens
- 1,000,000 tokens
- #4 o1: 200,000 tokens
- 200,000 tokens
Data available for 30 of 30 tracked models. Last updated 2026-03-24.
How sourc.dev tracks this
sourc.dev tracks context window through its automated monitoring pipeline. Data is collected on a regular schedule, compared against previous values, and any changes are recorded in the history table with full provenance — source URL, effective date, and verification timestamp. Nothing is overwritten. The pipeline ensures this attribute stays current without manual intervention.
Frequently asked questions
What happens when you exceed the context window?
The API will either return an error or silently truncate the oldest tokens. The exact behaviour depends on the provider. OpenAI returns a 400 error. Anthropic returns a similar error. If you are building an application, manage token counts proactively.
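One common way to manage token counts proactively is to trim the oldest conversation turns before each call. The sketch below assumes a chat-style message list and a caller-supplied `count_tokens` function (standing in for a provider tokeniser); none of the names are from a specific SDK.

```python
def trim_history(messages, max_tokens, count_tokens):
    """Keep the system prompt plus the newest messages that fit.

    messages: list of {"role": ..., "content": ...} dicts.
    count_tokens: token counter for a string (provider-specific).
    """
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(count_tokens(m["content"]) for m in system)
    kept = []
    # Walk from newest to oldest, keeping messages while they fit.
    for m in reversed(rest):
        cost = count_tokens(m["content"])
        if cost > budget:
            break
        kept.append(m)
        budget -= cost
    return system + list(reversed(kept))
```

Dropping whole messages from the front is the simplest policy; production systems often summarise the dropped turns instead of discarding them outright.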
Does a larger context window mean better quality?
Not necessarily. Research has shown a "lost in the middle" effect: models attend well to the beginning and end of long inputs but less to content in the middle. A model with a 200K context window may not use all of those tokens with equal quality.
How are context windows measured?
In tokens. A token is roughly 3-4 characters in English, or about 0.75 words. Different models use different tokenisers, so the same text may produce different token counts. Most providers offer tokeniser libraries for pre-counting.
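The two heuristics above can be turned into quick estimators. These are back-of-envelope figures for English text only; for exact counts you need the model's own tokeniser library.

```python
def tokens_from_chars(text: str) -> int:
    # ~4 characters per token in typical English text.
    return round(len(text) / 4)

def tokens_from_words(text: str) -> int:
    # ~0.75 words per token, i.e. ~1.33 tokens per word.
    return round(len(text.split()) / 0.75)
```

The two estimates will usually land close to each other but rarely agree exactly, which is itself a reminder that only the model's tokeniser gives an authoritative count.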