Vector Databases

Learning Objectives

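By the end of this module you will understand:

Why traditional databases struggle with embedding search
What a vector database is and how it retrieves the top K matches
How indexing techniques such as HNSW, IVF, and PQ speed up search
How metadata enables hybrid search with filters
How vector databases fit into RAG systems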

In the previous module, we learned how to represent text as embeddings and measure similarity between vectors.

The next challenge is scaling retrieval when we have millions or billions of embeddings.

Vector databases are the solution.

Traditional databases (SQL, NoSQL) are designed for exact-match lookups and structured queries over well-defined fields.

Example:


SELECT * FROM documents WHERE title = 'Return Policy';

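This works for structured queries but fails for vector similarity search: a similarity query has no exact value to match, and comparing the query against every stored embedding becomes too slow with millions or billions of vectors.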

A vector database is a system specifically designed to store high-dimensional embeddings and perform fast nearest neighbor search over them.

It allows us to scale semantic search and RAG systems.

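Core components of a vector database include the stored embeddings, an index for similarity search, and the metadata attached to each embedding.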

For example, given a query embedding:


Query: "How long is the return period?"

The database searches for the top K embeddings with the highest cosine similarity to the query embedding.

Example result:

Doc A: Return policy (score: 0.92)
Doc B: Warranty policy (score: 0.68)
Doc C: Shipping info (score: 0.42)

The system returns Doc A, which is the most relevant.
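
The top-K lookup above can be sketched as a brute-force scan. This is a minimal illustration, not how a production vector database works internally: the 3-dimensional embeddings and the helper names `cosine_similarity` and `top_k` are invented for this example, and real embeddings typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, docs, k=3):
    """Score every stored vector against the query and keep the k best."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy embeddings, made up for illustration only.
docs = {
    "Doc A: Return policy": [0.9, 0.1, 0.0],
    "Doc B: Warranty policy": [0.6, 0.6, 0.1],
    "Doc C: Shipping info": [0.1, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]  # stand-in for the embedded question

for name, score in top_k(query, docs, k=3):
    print(f"{name} (score: {score:.2f})")
```

Because every stored vector is scored, the cost grows linearly with the number of embeddings, which is why the indexing structures in the next section matter at scale.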

Vector databases use special indexing structures to reduce computation:

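HNSW (Hierarchical Navigable Small World): a multi-layer graph index for efficient approximate nearest neighbor (ANN) search over millions of vectors
IVF (Inverted File Index): partitions the vector space into clusters and searches only the clusters nearest the query
PQ (Product Quantization): compresses embeddings into short codes for memory efficiency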

These allow fast and memory-efficient nearest neighbor search.
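
To make the IVF idea concrete, here is a small sketch under simplifying assumptions: the centroids are hard-coded rather than learned (real systems usually train them, for example with k-means), Euclidean distance is used instead of cosine similarity, and `build_ivf`, `ivf_search`, and `nprobe` are illustrative names rather than any particular library's API.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: distance(vec, centroids[i]))
        lists[nearest].append(vec_id)
    return lists

def ivf_search(query, vectors, centroids, lists, k=1, nprobe=1):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: distance(query, centroids[i]))
    candidates = [vec_id for i in order[:nprobe] for vec_id in lists[i]]
    candidates.sort(key=lambda vec_id: distance(query, vectors[vec_id]))
    return candidates[:k]

# Toy 2-dimensional data, made up for illustration only.
vectors = {"a": [0.1, 0.2], "b": [0.2, 0.1], "c": [0.9, 0.8], "d": [0.8, 0.9]}
centroids = [[0.0, 0.0], [1.0, 1.0]]  # assumed to be pre-trained

lists = build_ivf(vectors, centroids)
result = ivf_search([0.9, 0.75], vectors, centroids, lists, k=2, nprobe=1)
print(lists)
print(result)
```

With `nprobe=1` only half of the vectors are compared against the query. Raising `nprobe` scans more clusters, trading speed for a lower chance of missing a true nearest neighbor.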

Several vector databases are widely used in AI applications:

Vector databases often store metadata alongside embeddings, such as a document's type.

This allows hybrid search: vector similarity combined with metadata filters.

Example:


Retrieve the most relevant document embedding where type = 'policy'
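
The filtered query above could look roughly like this in code. It is a sketch under assumptions: the records, their toy embeddings, and the `filtered_search` helper with its `where` argument are invented for illustration, and the filter is applied before ranking, whereas real vector databases differ in whether they filter before, during, or after the similarity search.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def filtered_search(query, records, k=1, where=None):
    """Keep records whose metadata matches `where`, then rank by similarity."""
    where = where or {}
    matching = [
        r for r in records
        if all(r.get(field) == value for field, value in where.items())
    ]
    matching.sort(key=lambda r: cosine_similarity(query, r["embedding"]), reverse=True)
    return matching[:k]

# Toy records, made up for illustration only.
records = [
    {"id": "Doc A", "type": "policy", "embedding": [0.9, 0.1, 0.0]},
    {"id": "Doc B", "type": "policy", "embedding": [0.6, 0.6, 0.1]},
    {"id": "Doc C", "type": "info", "embedding": [0.95, 0.15, 0.0]},
]
query = [1.0, 0.2, 0.0]

unfiltered = filtered_search(query, records, k=1)
policies_only = filtered_search(query, records, k=1, where={"type": "policy"})
print(unfiltered[0]["id"], policies_only[0]["id"])
```

In this toy data the closest vector overall belongs to a non-policy document, so the metadata filter changes which document is returned.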

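In RAG systems, vector databases serve as the retrieval engine: the user's question is embedded, the database returns the most similar document chunks, and the LLM generates an answer grounded in them.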

Vector DB + LLM = Retrieval-Augmented Generation
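
The way the two pieces fit together can be sketched as follows. Everything here is hypothetical scaffolding: `fake_retrieve` stands in for a vector database query, `fake_generate` stands in for an LLM call, and the chunk texts are placeholders rather than real documents.

```python
def build_prompt(question, chunks):
    """Combine retrieved chunks and the user question into one prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

def rag_answer(question, retrieve, generate, k=2):
    """Retrieve the top-k chunks, then hand the assembled prompt to the LLM."""
    chunks = retrieve(question, k)
    return generate(build_prompt(question, chunks))

# Hypothetical stand-ins so the sketch runs without a real database or model.
def fake_retrieve(question, k):
    stored_chunks = ["<return policy text>", "<warranty policy text>", "<shipping info text>"]
    return stored_chunks[:k]

def fake_generate(prompt):
    return f"(model output for a prompt of {len(prompt)} characters)"

question = "How long is the return period?"
prompt = build_prompt(question, fake_retrieve(question, 2))
print(prompt)
print(rag_answer(question, fake_retrieve, fake_generate))
```

Passing `retrieve` and `generate` in as functions keeps the pipeline independent of any specific vector database or model provider.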

Summary

Traditional databases cannot efficiently handle embedding search
Vector databases store high-dimensional embeddings and perform nearest neighbor search
Indexing techniques like HNSW, IVF, and PQ enable fast retrieval
Metadata allows hybrid search with filters
Vector databases are critical components of RAG systems

In the next module, we will explore Document Chunking Strategies.