What Is an AI Context Window?
Why the context window is the most important number you're not paying attention to — and how to use it strategically.
The short version
The context window is the maximum number of tokens a model can process in a single interaction, input plus output combined. Everything outside this window is invisible to the model. It has no memory of previous conversations unless the application resends them, no access to documents you haven't provided, and no awareness of anything beyond its current context.
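Because input and output share one budget, a chat application has to decide which earlier messages to resend on each turn. The following is a minimal sketch of one common approach, keeping the most recent messages that fit. The `Message` class, the `trim_history` function, and the idea that token counts are computed elsewhere are all assumptions made for illustration, not any provider's API.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str
    tokens: int  # assumed to be counted elsewhere with the model's tokenizer


def trim_history(messages, context_window, reserved_output):
    """Keep the newest messages that fit next to the reserved output budget.

    Returns (kept_messages_in_original_order, tokens_used).
    """
    budget = context_window - reserved_output
    if budget < 0:
        raise ValueError("reserved_output exceeds context_window")

    kept = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for msg in reversed(messages):
        if used + msg.tokens > budget:
            break
        kept.append(msg)
        used += msg.tokens

    kept.reverse()  # restore chronological order
    return kept, used
```

Anything dropped by `trim_history` is simply gone from the model's point of view, which is the practical meaning of "invisible" here. Real systems often summarize old turns instead of discarding them outright.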
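Context windows are measured in tokens, not words or characters. A token is typically 3-4 characters of English text: a short common word like "the" is 1 token, punctuation is usually 1 token, and a longer word like "transformer" may split into 2-3 tokens depending on the tokenizer. Context size is ultimately constrained by the transformer architecture underneath, because standard self-attention compares every token with every other token, so compute and memory grow roughly quadratically with sequence length. APIs charge per token, so context size determines both what a request costs and what tasks are feasible.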
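For quick planning, the characters-per-token rule of thumb can be turned into a rough estimator. This is only a sketch: the `estimate_tokens` and `fits_in_context` names and the default of 4 characters per token are assumptions, and exact counts require the tokenizer of the specific model you are calling.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate for English text; not a substitute for a real tokenizer."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text, context_window, reserved_output, chars_per_token=4.0):
    """Check whether the estimated prompt plus a reserved output budget fits."""
    return estimate_tokens(text, chars_per_token) + reserved_output <= context_window
```

An estimate like this is most useful as an early warning. Code, non-English text, and unusual formatting can tokenize much less efficiently than plain English prose, so leave headroom.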
Model comparison
Context windows have grown dramatically since GPT-3's 2K (2,048-token) limit in 2020. For a broader head-to-head, see ChatGPT vs Claude vs Gemini. Here's where leading models sit in 2026.
Important caveat: having a 1M token context window doesn't mean you should fill it. With per-token pricing, cost scales linearly with tokens, so a 200K-token prompt with Gemini costs about 200 times as much as a 1K-token prompt. And the "lost in the middle" problem means models tend to recall information placed in the middle of a very long context less reliably than information near the start or end, which is why production systems lean on retrieval-augmented generation to pull in only the relevant chunks.
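Both points can be sketched in a few lines: the linear cost arithmetic, and the retrieval step of keeping only the most relevant chunks within a token budget. Everything named here is hypothetical, including the `Chunk` class, the relevance scores (assumed to come from some upstream retriever), and the price per million tokens, which is a placeholder rather than any provider's real rate.

```python
from dataclasses import dataclass


def prompt_cost(tokens, price_per_million_tokens):
    """Input cost under flat per-token pricing (the price is a placeholder)."""
    return tokens * price_per_million_tokens / 1_000_000


@dataclass
class Chunk:
    text: str
    tokens: int   # token count, assumed precomputed
    score: float  # relevance score from a hypothetical retriever


def select_chunks(chunks, token_budget):
    """Greedily keep the highest-scoring chunks that still fit in token_budget."""
    selected = []
    used = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        if used + chunk.tokens <= token_budget:
            selected.append(chunk)
            used += chunk.tokens
    return selected


# Linear scaling: the ratio depends only on token counts, not on the price.
ratio = prompt_cost(200_000, 1.0) / prompt_cost(1_000, 1.0)
```

Greedy selection by score is the simplest possible policy and is used here only to show the shape of the idea. Production retrieval pipelines typically add reranking and deduplication, and may place the strongest chunks near the start or end of the prompt to work around the "lost in the middle" effect.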
What fits
Real-world equivalents to help you plan what you can actually load into a context window.
Using context well
FAQ
Sources