Vector databases explained for non-engineers.
"We're using a vector database for the knowledge base" — if you've heard this in an AI project and nodded politely, this post is for you. Vector databases are the storage layer behind most AI knowledge systems, and understanding what they do is essential for evaluating whether your AI system is built correctly.
Most explanations of vector databases start with the math. We're going to start with the problem they solve, because that's where the intuition lives.
The problem: finding things by meaning, not by keywords
Imagine you have 10,000 internal documents and you want to find the ones relevant to the question: "What's our policy on data retention for European clients?"
A traditional keyword search finds documents that contain the words "data retention" and "European." This works for simple queries but fails for nuanced ones. What if the relevant policy document uses the phrase "GDPR compliance obligations" instead of "European clients"? Or "data storage duration" instead of "data retention"? Keyword search misses those.
What you actually want is a search that finds documents with the same meaning as your query, regardless of whether they use the same words. This is semantic search, and it requires a different kind of storage and indexing than keyword search does.
What an embedding actually is
The key concept behind vector databases is the embedding — a mathematical representation of meaning.
An embedding is a list of numbers that represents the meaning of a piece of text. The list is typically a few hundred to a few thousand numbers long (768, 1,536, and 3,072 are common sizes). An embedding model (a type of AI model trained specifically for this purpose) converts text into these numbers. The critical property: text with similar meaning produces similar numbers.
Here's the intuition: imagine plotting every piece of text in a space where distance represents meaning similarity. "GDPR compliance obligations" and "European data regulations" would plot close together. "Monthly revenue report" would plot far away from both. The embedding is the address of a piece of text in that meaning space.
When you ask "What's our policy on data retention for European clients?" the embedding model converts that question into a set of numbers — its address in meaning space. A vector database finds the documents whose addresses are closest. Those are the semantically similar documents.
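To make "closest address" concrete, here is a minimal sketch of one common way to measure closeness, cosine similarity. The three-number vectors are made up for illustration; real embeddings come from an embedding model and are far longer.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented toy "embeddings" (assumption: real ones have hundreds of numbers)
toy_embeddings = {
    "GDPR compliance obligations": [0.9, 0.8, 0.1],
    "European data regulations": [0.8, 0.9, 0.2],
    "Monthly revenue report": [0.1, 0.2, 0.9],
}

gdpr = toy_embeddings["GDPR compliance obligations"]
for text, vector in toy_embeddings.items():
    # Higher score means closer in meaning space
    print(f"{text}: {cosine_similarity(gdpr, vector):.2f}")
```

With these toy numbers, the two regulation phrases score much closer to each other than either does to the revenue report.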
How a vector database works in practice
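In a RAG (retrieval-augmented generation) system, where a language model answers questions using documents retrieved from your knowledge base, the vector database is used in two phases. This is the most common use case.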
Ingestion phase (one-time and ongoing):
- Your documents are split into chunks (e.g., 400–800 words each)
- Each chunk is sent to an embedding model, which returns a vector (list of numbers)
- The vector, plus the original text and any metadata (source document, date, etc.), is stored in the vector database
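The ingestion steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a production pipeline: `toy_embed` is a placeholder that only counts words into buckets (a real system would call an embedding model), the Python list stands in for the vector database, and the file name is hypothetical.

```python
import hashlib

def chunk_words(text, max_words=400):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def toy_embed(text, dims=8):
    """Placeholder for a real embedding model: hashes each word into one of dims buckets."""
    vector = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dims
        vector[bucket] += 1.0
    return vector

def ingest(doc_text, source, store, max_words=400):
    """Chunk a document, embed each chunk, and append one record per chunk to store."""
    for position, chunk in enumerate(chunk_words(doc_text, max_words)):
        store.append({
            "vector": toy_embed(chunk),
            "text": chunk,
            "metadata": {"source": source, "chunk": position},
        })

store = []  # in-memory stand-in for the vector database
ingest("word " * 1000, source="retention-policy.txt", store=store)
print(len(store), "chunks stored")
```

Each stored record keeps the vector, the original text, and the metadata together, which is what lets the query phase return readable text rather than just numbers.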
Query phase (every time a question is asked):
- The user's question is sent to the same embedding model, producing a query vector
- The vector database finds the stored vectors closest to the query vector (nearest neighbors)
- The original text of those nearest chunks is returned — this is the retrieved context
- The context plus the question is sent to the language model, which produces an answer
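The vector search step typically takes milliseconds, even over millions of vectors, because vector databases are optimized specifically for nearest-neighbor search. Embedding the question and generating the answer with the language model usually account for most of the total response time.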
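Here is a minimal sketch of the query phase, assuming the stored records already have vectors. It compares the query against every record, which is fine for a toy example; real vector databases use specialized indexes so they don't have to check every vector. The sample texts and vectors are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest_chunks(query_vector, records, k=2):
    """Return the k records whose vectors are most similar to query_vector."""
    ranked = sorted(
        records,
        key=lambda r: cosine_similarity(query_vector, r["vector"]),
        reverse=True,
    )
    return ranked[:k]

# Invented sample records (assumption: vectors would come from an embedding model)
records = [
    {"text": "Customer records are kept for a fixed retention period.", "vector": [0.9, 0.1, 0.0]},
    {"text": "GDPR obligations apply to EU customers.", "vector": [0.8, 0.3, 0.1]},
    {"text": "Q3 revenue grew in all regions.", "vector": [0.0, 0.2, 0.9]},
]
query_vector = [0.85, 0.2, 0.05]  # the question, embedded by the same model

question = "What's our policy on data retention for European clients?"
context = "\n".join(r["text"] for r in nearest_chunks(query_vector, records))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # this combined text is what would be sent to the language model
```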
The major vector database options
The choice of vector database matters less than people think for most applications, but here's a practical guide:
pgvector: an extension for PostgreSQL that adds vector search capability. If you're already using PostgreSQL, this is often the right starting point: no additional infrastructure, good performance up to a few million vectors, and the ability to combine vector search with ordinary SQL filters and joins in a single query. We default to pgvector for most early-stage applications.
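As a rough illustration of mixing vector search with regular SQL, the sketch below builds (but does not run) a filtered similarity query. The table name `chunks` and the columns `content`, `source`, and `embedding` are hypothetical; `<=>` is pgvector's cosine-distance operator, and the `%s` placeholders assume a psycopg-style database driver.

```python
def build_pgvector_query(query_vector, source, limit=5):
    """Return (sql, params) for a filtered similarity search; nothing is executed here."""
    # pgvector accepts vectors written as text like "[0.1,0.2,0.3]"
    vector_literal = "[" + ",".join(str(x) for x in query_vector) + "]"
    sql = (
        "SELECT content, source "
        "FROM chunks "                         # hypothetical table name
        "WHERE source = %s "                   # ordinary SQL filter
        "ORDER BY embedding <=> %s::vector "   # smallest cosine distance first
        "LIMIT %s"
    )
    return sql, (source, vector_literal, limit)

sql, params = build_pgvector_query([0.1, 0.2, 0.3], source="policies")
print(sql)
print(params)
```

The point of the example is the shape of the query: the `WHERE` clause narrows by metadata and the `ORDER BY` clause ranks by similarity, all in one statement.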
Pinecone: managed cloud vector database. The easiest to get started with — no server management, simple API, scales automatically. Costs money and is not open source. Good choice when operational simplicity matters and cost isn't a primary constraint.
Chroma: lightweight, open-source, easy to run locally. Great for development and prototyping. For production at significant scale, you'll probably move to something more robust.
Weaviate: open-source with a managed option. Strong feature set including hybrid search (combining vector search with keyword search), good for enterprise deployments where filtering and metadata management are complex.
Qdrant: open-source with a managed option. Strong performance, good filtering capabilities. A good choice when you want open-source with production-grade reliability.
What hybrid search is and when you need it
Pure vector search finds semantically similar content. But sometimes you need to combine semantic similarity with keyword precision — especially for technical content, product names, IDs, and other cases where exact matches matter.
Hybrid search combines vector search with keyword search (usually BM25, a statistical relevance algorithm). The results from both approaches are merged and re-ranked. This typically outperforms pure vector search for mixed knowledge bases that contain both narrative content (where meaning matters) and technical content (where exact terms matter).
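One common way to merge the two result lists is reciprocal rank fusion (RRF), sketched below. This is one technique among several, not necessarily what any particular database does by default. The document IDs are invented, and `k=60` is a conventional smoothing constant rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of document IDs; documents ranked high in several lists win."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank), so top ranks count the most
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Invented result lists, best match first
vector_results = ["policy-overview", "gdpr-guide", "retention-faq"]
keyword_results = ["gdpr-guide", "policy-DR-114", "policy-overview"]

print(reciprocal_rank_fusion([vector_results, keyword_results]))
```

Documents that appear near the top of both lists rise to the top of the merged list, while a document found only by keyword match (such as an exact policy ID) still makes it into the results.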
If your knowledge base contains product codes, policy IDs, or specific technical terminology that must be matched exactly, plan for hybrid search from the start.
Common misconceptions
"We need a vector database to do AI." No — many AI applications don't need one. You only need a vector database when you're doing semantic search over a knowledge base. Simple AI applications that don't need to retrieve information from documents don't need one.
"A bigger vector database is better." Quality beats quantity. A well-curated knowledge base of 1,000 high-quality documents beats a poorly organized collection of 50,000 mediocre ones. The retrieval will find the most similar chunks — if those chunks are poor quality, the AI's answers will be poor quality.
"Once it's indexed, I don't need to think about it." Vector databases require ongoing maintenance: keeping the knowledge base current as documents change, removing outdated information, monitoring retrieval quality, and adjusting chunking strategies when retrieval performance degrades.
If you're designing an AI system that involves a knowledge base and aren't sure whether you need a vector database or what type to use, ask us. It's usually a quick conversation.