Vector Databases

Learning Objectives

By the end of this module you will understand:

  - Why traditional databases cannot handle vector similarity search
  - What a vector database is and what its core components are
  - How indexing techniques (HNSW, IVF, PQ) make nearest neighbor search fast
  - How vector databases serve as the retrieval engine in RAG pipelines

In the previous module, we learned how to represent text as embeddings and measure similarity between vectors.

The next challenge is scaling retrieval when we have millions or billions of embeddings.

Vector databases are the solution.


1. Why Traditional Databases Fail

Traditional databases (SQL and NoSQL) are designed for:

  - Exact-match lookups on keys and fields
  - Structured filters and range queries
  - Transactional reads and writes

Example:


SELECT * FROM documents WHERE title = 'Return Policy';

This works for structured queries but fails for vector similarity search:

  - There is no exact value to match; relevance is a matter of distance between vectors
  - Comparing a query against every stored embedding (a full scan) does not scale
  - Traditional indexes (B-trees, hash indexes) are not built for high-dimensional nearest neighbor search


2. What is a Vector Database?

A vector database is a system specifically designed to:

  - Store high-dimensional embedding vectors
  - Index them for fast approximate nearest neighbor search
  - Retrieve the most similar vectors for a given query embedding

It allows us to scale semantic search and RAG systems.


3. How Vector Databases Work

Core components of a vector database:

  1. Storage: stores embeddings in high-dimensional space
  2. Indexing: builds structures (like HNSW, IVF, PQ) for fast search
  3. Search API: accepts query embedding and returns top K nearest neighbors
  4. Metadata storage: optional extra information (document ID, text, source)

4. Nearest Neighbor Search in Practice

For example, given the embedding of a query:


Query: "How long is the return period?"

The database searches for the top K embeddings with highest cosine similarity.

Example result:


Doc A: Return policy (score: 0.92)
Doc B: Warranty policy (score: 0.68)
Doc C: Shipping info (score: 0.42)

The system returns Doc A, which is the most relevant.
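The scoring step above can be sketched with tiny hand-made 3-dimensional vectors standing in for real model embeddings (`top_k` is a hypothetical helper, and the numbers are illustrative only):

```python
import numpy as np

def top_k(query, doc_vectors, doc_names, k=3):
    """Rank documents by cosine similarity to the query and return the top k."""
    q = query / np.linalg.norm(query)
    d = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    scores = d @ q                              # cosine similarity per document
    order = np.argsort(scores)[::-1][:k]        # highest score first
    return [(doc_names[i], round(float(scores[i]), 2)) for i in order]
```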


5. Indexing Techniques

Vector databases use special indexing structures to reduce computation:

  - HNSW (Hierarchical Navigable Small World): a layered graph traversed greedily toward the query
  - IVF (Inverted File index): clusters the vectors and searches only the most promising clusters
  - PQ (Product Quantization): compresses vectors into compact codes to reduce memory use

These allow fast and memory-efficient nearest neighbor search.
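To make the IVF idea concrete, here is a toy index in plain NumPy: vectors are grouped with a few k-means steps, and a search scans only the clusters whose centroids are nearest to the query. All names are hypothetical; production systems use libraries such as FAISS for this.

```python
import numpy as np

def build_ivf(vectors, n_clusters=2, n_iters=5, seed=0):
    """Tiny IVF index: k-means centroids plus an inverted list per cluster."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), n_clusters, replace=False)]
    for _ in range(n_iters):
        # Assign each vector to its nearest centroid, then recompute centroids
        assign = np.argmin(
            np.linalg.norm(vectors[:, None] - centroids[None], axis=2), axis=1
        )
        for c in range(n_clusters):
            if np.any(assign == c):
                centroids[c] = vectors[assign == c].mean(axis=0)
    lists = {c: np.where(assign == c)[0] for c in range(n_clusters)}
    return centroids, lists

def ivf_search(query, vectors, centroids, lists, n_probe=1):
    """Scan only the n_probe clusters whose centroids are closest to the query."""
    probe = np.argsort(np.linalg.norm(centroids - query, axis=1))[:n_probe]
    candidates = np.concatenate([lists[c] for c in probe])
    dists = np.linalg.norm(vectors[candidates] - query, axis=1)
    return candidates[np.argmin(dists)]  # index of the nearest candidate
```

Increasing `n_probe` trades speed for recall: probing more clusters costs more comparisons but reduces the chance of missing the true nearest neighbor.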


6. Popular Vector Databases

Several vector databases are widely used in AI applications:

  - Pinecone (managed cloud service)
  - Weaviate (open source, hybrid search support)
  - Milvus (open source, built for large-scale deployments)
  - Qdrant (open source, strong metadata filtering)
  - Chroma (lightweight, popular for prototyping)
  - FAISS (a library rather than a full database, often embedded in custom systems)


7. Metadata and Hybrid Search

Vector databases often store metadata alongside embeddings:

  - Document ID and original text
  - Source (file, URL, system of record)
  - Attributes such as type, category, or timestamp

This allows hybrid search: combining vector similarity with structured metadata filters in a single query.

Example:


Retrieve the most relevant document embedding where type = 'policy'
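A sketch of this hybrid query, assuming metadata is stored as one dict per vector (`hybrid_search` is a hypothetical helper, not a real database API; real systems apply such filters inside the index):

```python
import numpy as np

def hybrid_search(query, vectors, metadata, k=1, **filters):
    """Metadata pre-filter, then cosine ranking over the surviving vectors."""
    keep = [i for i, m in enumerate(metadata)
            if all(m.get(key) == val for key, val in filters.items())]
    if not keep:
        return []
    sub = vectors[keep]
    sims = sub @ query / (
        np.linalg.norm(sub, axis=1) * np.linalg.norm(query)
    )
    order = np.argsort(sims)[::-1][:k]
    return [(metadata[keep[i]], float(sims[i])) for i in order]
```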


8. Integration with RAG Pipelines

In RAG systems, vector databases serve as the retrieval engine:

  1. Embed documents → store embeddings in vector DB
  2. Embed query → perform nearest neighbor search
  3. Retrieve top K documents → send to language model
  4. Generate answer → language model interprets retrieved context

Vector DB + LLM = Retrieval-Augmented Generation
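The four steps above can be sketched end to end. Both models are stand-ins here: `fake_embed` is a deterministic character-count vector in place of a real embedding model, and a format string stands in for the language model, purely to show the data flow.

```python
import numpy as np

def fake_embed(text):
    """Stand-in for an embedding model: a normalized bag-of-letters vector."""
    v = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            v[ord(ch) - ord("a")] += 1
    return v / (np.linalg.norm(v) or 1.0)

def rag_answer(query, documents, k=1):
    # 1. Embed documents -> store embeddings
    doc_vecs = np.array([fake_embed(d) for d in documents])
    # 2. Embed query -> nearest neighbor search
    q = fake_embed(query)
    scores = doc_vecs @ q
    # 3. Retrieve top-k documents as context
    context = [documents[i] for i in np.argsort(scores)[::-1][:k]]
    # 4. Generate answer (a format string stands in for the LLM call)
    return f"Answer based on: {' | '.join(context)}"
```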


9. Benefits of Vector Databases

  - Scale semantic search to millions or billions of embeddings
  - Low-latency top K retrieval through approximate nearest neighbor indexes
  - Metadata filtering and hybrid search in a single query
  - A ready-made retrieval engine for RAG pipelines

Key Takeaways

  - Traditional databases handle exact matches, not similarity in high-dimensional space
  - Vector databases store, index, and search embeddings at scale
  - Indexing structures (HNSW, IVF, PQ) make nearest neighbor search fast and memory-efficient
  - Metadata enables hybrid search; vector DB + LLM yields Retrieval-Augmented Generation

Next Module

In the next module we will explore Document Chunking Strategies.
