
Context Window

Models & Architectures

Maximum number of tokens a model can process at once.


The context window defines how many tokens of input and conversation history a model can attend to at once; text beyond this limit must be truncated, summarized, or retrieved separately.
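
To make the limit concrete, here is a minimal sketch that counts tokens and trims a prompt so it fits inside an assumed 8,192-token window. The tiktoken tokenizer, the window size, and the output reserve are illustrative choices, not part of the definition above.

```python
# Minimal sketch: check whether a prompt fits a model's context window and
# trim the oldest tokens if it does not. The 8192-token window and 512-token
# output reserve are assumed, illustrative figures.
import tiktoken

CONTEXT_WINDOW = 8192          # assumed total window size (tokens)
RESERVED_FOR_OUTPUT = 512      # leave room for the model's reply

enc = tiktoken.get_encoding("cl100k_base")

def fit_to_window(prompt: str) -> str:
    """Return the prompt, trimmed from the front so it fits the window."""
    budget = CONTEXT_WINDOW - RESERVED_FOR_OUTPUT
    tokens = enc.encode(prompt)
    if len(tokens) <= budget:
        return prompt
    # Keep the most recent tokens; older history falls outside the window.
    return enc.decode(tokens[-budget:])

print(len(enc.encode(fit_to_window("some very long conversation history ..."))))
```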

  • Effects: Determines how much prior text the model can use, which affects coherence and output quality on long documents and multi-turn conversations.
  • Trade-offs: Attention memory and compute grow with window length; long-context methods such as sliding-window and sparse attention trade full coverage for efficiency.
  • Practice: Chunking, retrieval-augmented generation (RAG), summarization, and structured prompts keep inputs within the window (see the sketch after this list).
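
As a rough illustration of the chunking practice above, the sketch below splits a long document into overlapping pieces that each stay under a token budget; the function name, sizes, and whitespace-based token proxy are all hypothetical choices.

```python
# Sketch of chunking for long inputs: split a document into overlapping
# pieces that each stay under a token budget, so every piece fits in the
# context window. Whitespace words stand in for real tokens here; swap in
# a proper tokenizer for accurate counts. Sizes are illustrative.
def chunk_document(text: str, max_tokens: int = 1000, overlap: int = 100) -> list[str]:
    words = text.split()                     # crude stand-in for tokenization
    step = max_tokens - overlap              # advance so consecutive chunks overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break
    return chunks

if __name__ == "__main__":
    doc = "token " * 2500                    # toy long document
    pieces = chunk_document(doc)
    print(len(pieces), [len(p.split()) for p in pieces])
```

Each chunk can then be processed separately, or embedded and retrieved on demand as in RAG, and the per-chunk results combined, which is the usual way these practices work around a fixed window.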