Cosine Similarity
A measure of how semantically similar two texts are, based on the angle between their embedding vectors.
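Cosine similarity computes the cosine of the angle between two embedding vectors, producing a score between -1 and 1. A score near 1 means the vectors point in nearly the same direction, which is taken to mean the texts are semantically similar; a score near 0 means the texts are unrelated, and a negative score means the vectors point in opposing directions. Strictly speaking it is a similarity measure rather than a distance metric, and it is one of the most widely used scoring functions for vector search.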
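To make the definition concrete, here is a minimal sketch in plain Python. It divides the dot product of two vectors by the product of their lengths. The function name cosine_similarity and the toy vectors are illustrative assumptions, not part of any particular library or vector database; real embeddings typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (hypothetical, chosen only to show the range of scores).
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])                # 0
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])                 # -1
```

In practice a numerical library would usually do this in a vectorized way; the loop-based version is shown only because it mirrors the formula directly.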
Unlike Euclidean distance, cosine similarity is magnitude-invariant: it depends on the direction of the vectors, not their length. This makes it robust when embeddings differ in magnitude, for example across texts of different lengths, and is one reason it is a common default similarity function in vector databases.
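The contrast with Euclidean distance can be checked with a small sketch. Scaling a vector by a constant changes its length but not its direction, so the cosine score stays at 1 while the Euclidean distance grows. The helper names and the example vector are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical embedding and a copy scaled to ten times the length.
v = [0.2, 0.5, 0.1]
scaled = [10 * x for x in v]

cos_scaled = cosine_similarity(v, scaled)    # direction unchanged, so about 1
dist_scaled = euclidean_distance(v, scaled)  # grows with the scale factor

print(f"cosine similarity:  {cos_scaled:.4f}")
print(f"Euclidean distance: {dist_scaled:.4f}")
```

When all vectors are normalized to unit length, ranking by cosine similarity and ranking by Euclidean distance give the same order, which is why some systems normalize embeddings once and then use a plain dot product.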
Related Terms
More AI/ML Terms
Retrieval-Augmented Generation (RAG)
An AI architecture that combines information retrieval with text generation to produce answers grounded in source documents.
Vector Embedding
A numerical representation of text as a high-dimensional vector, enabling semantic similarity comparisons between passages.
BM25
A probabilistic keyword-ranking algorithm that scores documents by term frequency and inverse document frequency.
Chunking
The process of splitting large documents into smaller, overlapping segments optimized for retrieval and embedding.
Hallucination
When an AI model generates plausible-sounding but factually incorrect or fabricated information.
Large Language Model (LLM)
A neural network trained on massive text corpora that can understand and generate human language.