Context Window
The maximum number of tokens a language model can process in a single request, including both input and output.
Context windows range from 4K tokens in older models to over 1M tokens in newer ones. The window must accommodate the system prompt, retrieved passages, conversation history, and the generated response. Exceeding the limit causes truncation or errors.
For document intelligence, context window size determines how many retrieved chunks can be included in a single query. Larger windows allow more evidence to be presented to the model, but cost scales linearly with token usage. Effective chunking and retrieval reduce the need for massive context windows.
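The budgeting described above can be sketched in a few lines: subtract the fixed costs (system prompt, history, reserved output) from the window, then greedily add ranked chunks until the remaining budget is spent. This is a minimal illustration, not a production implementation; the word-based token estimate and all names are assumptions, and real systems use a model-specific tokenizer.

```python
# Sketch: fitting retrieved chunks into a fixed context window.
# Token counts are approximated as words * 1.3 (a rough heuristic);
# real systems use the model's actual tokenizer.

def approx_tokens(text: str) -> int:
    """Rough token estimate: ~1.3 tokens per whitespace-delimited word."""
    return int(len(text.split()) * 1.3)

def select_chunks(chunks: list[str], budget: int) -> list[str]:
    """Greedily keep ranked chunks until the token budget is exhausted."""
    selected, used = [], 0
    for chunk in chunks:
        cost = approx_tokens(chunk)
        if used + cost > budget:
            break
        selected.append(chunk)
        used += cost
    return selected

# Budget = window minus system prompt, history, and reserved output tokens
# (all sizes here are illustrative).
window = 8192
budget = window - 512 - 1024 - 1024
ranked_chunks = ["passage one ...", "passage two ..."]  # ordered by retrieval score
kept = select_chunks(ranked_chunks, budget)
```

Because cost scales with tokens, tightening retrieval so that fewer, more relevant chunks fill the budget is usually cheaper than reaching for a larger window.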
More AI/ML Terms

Retrieval-Augmented Generation (RAG)
An AI architecture that combines information retrieval with text generation to produce answers grounded in source documents.
Vector Embedding
A numerical representation of text as a high-dimensional vector, enabling semantic similarity comparisons between passages.
BM25
A probabilistic keyword-ranking algorithm that scores documents by term frequency and inverse document frequency.
Chunking
The process of splitting large documents into smaller, overlapping segments optimized for retrieval and embedding.
Hallucination
When an AI model generates plausible-sounding but factually incorrect or fabricated information.
Large Language Model (LLM)
A neural network trained on massive text corpora that can understand and generate human language.
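Of the terms above, chunking is the most mechanical and is easy to illustrate: split a document into fixed-size segments that overlap so that context spanning a boundary is not lost. This is a minimal word-based sketch; the segment size, overlap, and function name are illustrative, and production systems typically chunk by tokens or by semantic boundaries.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks of `size` words.

    Consecutive chunks share `overlap` words so that sentences crossing
    a chunk boundary remain retrievable from at least one segment.
    """
    words = text.split()
    step = size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):  # last window reached the end
            break
    return chunks
```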