

What is a vector database, and do you actually need one?

A vector database stores embeddings and finds similar ones fast. For most RAG apps that are just getting started, your existing Postgres with pgvector is enough — and a dedicated vector DB might be over-engineering.

A vector database is a system optimized for storing and searching high-dimensional vectors — the embeddings produced by AI models. It's the storage layer of any RAG, semantic search, or recommendation system. The hype around "you need a vector DB" is only partly justified: the choice between a dedicated vector DB and plain Postgres with pgvector is more nuanced than the marketing implies.

What a vector DB actually does

The core job: given millions of vectors, find the K nearest to a query vector — fast.

Brute force is O(n) — compare the query to every stored vector. Fine at 10,000 records (milliseconds). Painful at 1,000,000 (seconds). Impossible at 100,000,000.
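The brute-force baseline is worth seeing in code. A minimal pure-Python sketch, assuming cosine similarity as the metric (the toy 2-D vectors are illustrative; real embeddings have hundreds or thousands of dimensions):

```python
import heapq
import math

def cosine(a, b):
    # Cosine similarity: dot product over the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, vectors, k=3):
    # Brute force: score the query against every stored vector, O(n).
    scored = ((cosine(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)  # [(score, index), ...], best first

vectors = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0], [-1.0, 0.0]]
print(top_k([1.0, 0.1], vectors, k=2))  # index 0 ranks first
```

This is exactly what a vector DB replaces: an ANN index answers the same top-K question without touching every vector.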

Vector DBs use approximate nearest neighbor (ANN) indexes — typically HNSW (Hierarchical Navigable Small World) or IVF (Inverted File Index) — to get sub-linear search. Trade some accuracy (you might miss the truly nearest neighbor 1-5% of the time) for massive speedup. For semantic search, that tradeoff is almost always worth it.

On top of search, modern vector DBs offer:

  • Metadata filtering — "find similar vectors where category = 'docs' and updated_at > 2025-01-01"
  • Hybrid search — combine vector similarity with keyword (BM25) scores
  • Multi-tenancy — isolate vectors by user/team/workspace
  • Reranking integration — feed top-K to a reranker model for better order
  • Persistence and backup — durable storage, replication
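Hybrid search from the list above is commonly implemented with reciprocal rank fusion (RRF), which merges the keyword and vector rankings without having to normalize their incompatible scores. A sketch — the doc IDs are made up, and k=60 is the conventional damping constant:

```python
def rrf(rankings, k=60):
    # Reciprocal rank fusion: score(doc) = sum over lists of 1 / (k + rank).
    scores = {}
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits   = ["doc_a", "doc_c", "doc_b"]  # keyword (BM25) ranking
vector_hits = ["doc_b", "doc_a", "doc_d"]  # vector-similarity ranking
print(rrf([bm25_hits, vector_hits]))  # doc_a first: ranked 1st and 2nd
```

Documents that rank well in either list float to the top, which is why hybrid search often beats pure vector similarity on queries with rare keywords.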

The 2026 landscape

The options break into three groups:

Postgres extensions:

  • pgvector — the default. Open-source extension, runs in any Postgres (including Supabase, Neon, RDS). Up to ~10M rows comfortably. Same DB as your app data, no separate ops.
  • pgvectorscale — Timescale's extension on top of pgvector, adds StreamingDiskANN index for billion-scale.
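For a taste of the pgvector workflow, a minimal sketch — table and column names are illustrative, and the shortened vector literal is a placeholder; `<=>` is pgvector's cosine-distance operator:

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
  id        bigserial PRIMARY KEY,
  content   text,
  embedding vector(1536)   -- dimension must match your embedding model
);

-- ANN index; without it, ORDER BY does an exact sequential scan
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- top-5 nearest neighbors by cosine distance
SELECT id, content
FROM documents
ORDER BY embedding <=> '[0.01, -0.02, ...]'::vector
LIMIT 5;
```

Everything here is ordinary Postgres — same connection, same transactions, same backups as the rest of your app data.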

Dedicated managed:

  • Pinecone — original cloud vector DB, mature, scales to billions, $70/mo starter.
  • Weaviate Cloud — open-source core with managed offering, modular schemas.
  • Qdrant Cloud — Rust-based, fast, generous free tier.
  • Milvus / Zilliz Cloud — large-scale, more complex.

Embedded / lightweight:

  • Chroma — embedded Python library, dev-friendly. Good for prototypes, less so for production.
  • DuckDB / SQLite vector extensions — embedded, single-file, great for offline or edge.
  • LanceDB — columnar vector storage, fast scanning, growing in popularity.

How to actually choose

A decision tree that works for most people:

  1. Are you starting out, with < 10M vectors, and already using Postgres? Use pgvector. You keep one DB, ship faster, and most apps never outgrow it.
  2. Do you need 50M+ vectors with strict latency SLAs and zero ops budget? Use Pinecone or Qdrant Cloud.
  3. Do you self-host everything and want open-source? Qdrant or Weaviate.
  4. Are you on the edge, mobile, or local-first? SQLite vector extension or LanceDB.
  5. Are you embedding everything in a Python app for prototyping? Chroma is the lowest-friction choice.
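The tree above can be sketched as a function — a toy encoding for illustration, not a substitute for judgment; the question order mirrors the list, so the first match wins:

```python
def pick_vector_store(n_vectors, uses_postgres, strict_latency_sla,
                      self_host, edge_or_local, prototyping):
    # Toy encoding of the decision tree; returns a suggestion string.
    if uses_postgres and n_vectors < 10_000_000:
        return "pgvector"
    if n_vectors >= 50_000_000 and strict_latency_sla:
        return "Pinecone or Qdrant Cloud"
    if self_host:
        return "Qdrant or Weaviate"
    if edge_or_local:
        return "SQLite vector extension or LanceDB"
    if prototyping:
        return "Chroma"
    return "pgvector (default; migrate only when scale demands it)"

print(pick_vector_store(100_000, True, False, False, False, False))
```

Note that the default branch is still pgvector — that asymmetry is the point of the next paragraph.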

The over-engineering trap: founders building their first RAG app pick Pinecone because it's the "AI-native" choice, then discover the app would have been simpler with pgvector inside the Postgres they already run. The regret typically arrives within a month or two, once the second database starts demanding its own ops attention.

Real-world performance numbers

To set expectations (2026 generation hardware):

  • pgvector with HNSW on a Supabase Pro tier: 100K vectors at 1,536 dims → 5-20ms p95 query latency, $25/mo (the Supabase plan, not the vector DB itself).
  • pgvector on the same hardware at 10M vectors: 30-100ms p95.
  • Pinecone serverless at 10M vectors: 20-50ms p95, but ~$70-200/mo depending on usage.
  • Qdrant self-hosted on a $20/mo VPS: 100K vectors → 2-10ms.

The pattern: pgvector is fast enough for ~95% of apps, and the gap to dedicated DBs only matters at scale or under tight latency SLAs.

What about the metadata problem?

A common pitfall: you embed documents, store them in a vector DB, then realize you need to filter by tenant_id, created_at, category. Some vector DBs handle this elegantly; others don't.

  • pgvector inherits Postgres's full SQL: any WHERE clause works perfectly. Best metadata story.
  • Pinecone, Qdrant, Weaviate: have their own metadata filter syntaxes, often slightly limited.
  • Chroma: simple metadata filtering, works for basic cases.

If your retrieval needs structured filters more than "top-K vectors with no constraints," pgvector tends to win.
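With pgvector, that combined filter-plus-similarity query is ordinary SQL — column names here are illustrative, and `$1` stands for the query embedding bound as a parameter:

```sql
SELECT id, content
FROM documents
WHERE tenant_id = 42
  AND category = 'docs'
  AND updated_at > '2025-01-01'
ORDER BY embedding <=> $1
LIMIT 10;
```

Dedicated vector DBs can express the same filter, but through their own JSON-style filter DSLs rather than the full relational toolbox of joins, subqueries, and indexes.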

When NOT to use a vector DB at all

  • You have < 1,000 documents. Skip the vector store entirely. Embed at request time and brute-force-compare in memory, or just put all docs in the LLM's context window.
  • Your queries are mostly exact match. Use Postgres FTS (full-text search) or BM25; vector search is overkill.
  • Your data fits in RAM and stays small. A Python dict and numpy.argmax is sometimes the right answer.
  • You're not at the retrieval-quality wall yet. If your bottleneck is bad chunking or wrong embedding model, a fancier DB doesn't fix it.

Migration cost

A practical concern: switching vector DBs is annoying but not catastrophic. Re-embedding is usually unnecessary — you export vectors + metadata, import to the new DB. Most teams do this once, when scale demands it. Plan to start with pgvector and migrate if/when needed; don't pre-optimize.

Further reading

  • What is an embedding
  • What is RAG (Retrieval-Augmented Generation)
  • How to pick a vector database (Pinecone vs pgvector vs Qdrant)
  • Hybrid search (BM25 + vector) for RAG systems
  • How to build your first RAG stack (a 2026 pick guide)

Last updated: 2026-04-29
