Vector databases have become a religion. People pick them based on Hacker News threads or benchmark charts that are mostly meaningless at the scale where real products actually operate. Here's the honest framework: most products under 10M vectors should use pgvector. Most products above that should use Qdrant or Pinecone. Almost nobody needs the rest.
pgvector: the right default for 90% of products
If you have a Postgres database (and at this point, who doesn't), pgvector is almost certainly your right answer. It's a Postgres extension; you keep all your existing tooling — backups, replication, transactions, joins. You can store your vectors right next to the source data they reference. For most production AI features (semantic search over your existing tables, RAG over your help docs, recommendation embeddings) this is unbeatable.
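To make that concrete, here's a minimal sketch of the whole setup in one place: extension, table, HNSW index, and a cosine-distance query. The `docs` table, DSN, and 768-dim embeddings are placeholders; it assumes psycopg 3 and pgvector 0.5+ (for HNSW).

```python
# pgvector sketch: embeddings live in a column next to the rows they
# describe. Table name, DSN, and dimension (768) are placeholders.
import psycopg

conn = psycopg.connect("postgresql://localhost/mydb")

with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id        bigserial PRIMARY KEY,
            body      text,
            embedding vector(768)  -- must match your embedding model
        )
    """)
    # m and ef_construction are the main build-time HNSW tuning knobs
    cur.execute("""
        CREATE INDEX IF NOT EXISTS docs_embedding_idx
        ON docs USING hnsw (embedding vector_cosine_ops)
        WITH (m = 16, ef_construction = 64)
    """)
conn.commit()

def search(query_vec: list[float], k: int = 10):
    literal = "[" + ",".join(str(x) for x in query_vec) + "]"
    with conn.cursor() as cur:
        cur.execute("SET hnsw.ef_search = 100")  # query-time recall/speed knob
        cur.execute(
            "SELECT id, body FROM docs ORDER BY embedding <=> %s::vector LIMIT %s",
            (literal, k),  # <=> is cosine distance
        )
        return cur.fetchall()
```

That `search` function joins, filters, and transacts like any other SQL, which is the whole point.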
Benchmarks: pgvector with HNSW indexes handles tens of millions of vectors comfortably on a normal Postgres box. The last few generations of pgvector improvements (HNSW indexes, parallel index builds, and the DiskANN-style indexing and quantization in the pgvectorscale extension) closed most of the performance gap with dedicated vector DBs.
When pgvector breaks: > 50M vectors with strict latency requirements, or extremely high write throughput (millions of vectors/hour streaming in). You'll know if you're at that scale.
Qdrant: the best dedicated vector DB for self-hosting
Qdrant is the strongest open-source dedicated vector DB. Written in Rust, fast, easy to deploy (single binary or Docker), and the cloud product is reasonably priced. The filtering story is the best in the industry: complex metadata filters are applied during the vector search itself rather than as a post-filter, so recall holds up even under highly selective filters.
Use Qdrant when: pgvector hits its limits, you want hybrid search (BM25 + vector) baked in, or you specifically need advanced filtering. Qdrant Cloud handles the ops. Self-hosted Qdrant on a $20/month VPS handles a million vectors fine.
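A sketch of what that filtering looks like with the Python client. Collection and field names are made up; it assumes qdrant-client 1.10+ (older versions use `client.search` instead of `query_points`) and an instance on localhost:6333.

```python
# Qdrant filtered-search sketch; "docs" and "tenant_id" are placeholders.
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, FieldCondition, Filter, MatchValue, VectorParams,
)

client = QdrantClient(url="http://localhost:6333")

client.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

# The filter runs inside the HNSW traversal rather than as a
# post-filter, so selective filters don't tank recall.
hits = client.query_points(
    collection_name="docs",
    query=[0.1] * 768,  # your query embedding goes here
    query_filter=Filter(
        must=[FieldCondition(key="tenant_id", match=MatchValue(value="acme"))]
    ),
    limit=10,
).points
```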
Weakness: the ecosystem and community are smaller than Pinecone's. Less Stack Overflow material. The product also changes fast; what you read in a tutorial from 2024 is often outdated.
Pinecone: the right choice for ops-poor teams
Pinecone is the most expensive option and also the easiest. You don't think about your DB. It scales automatically. The reliability is genuinely excellent. If your team has more money than ops capacity (most early-stage startups), Pinecone removes a class of headaches.
Use Pinecone when: you need it to just work, you're at scale (>10M vectors), or you don't have anyone who wants to run infra. The serverless tier removed the old "$70/month minimum" cliff; you can now start cheap and scale up.
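For flavor, the serverless happy path looks roughly like this. Index name, cloud, region, and dimension are placeholders; it assumes the v3+ `pinecone` SDK (the one that ships `ServerlessSpec`).

```python
# Pinecone serverless sketch; all names here are illustrative.
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="...")  # read this from the environment in real code

pc.create_index(
    name="docs",
    dimension=768,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

index = pc.Index("docs")
index.upsert(vectors=[
    {"id": "doc-1", "values": [0.1] * 768, "metadata": {"source": "help"}},
])
result = index.query(vector=[0.1] * 768, top_k=5, include_metadata=True)
```

Notice what's missing: no sizing, no index tuning, no server. That's what you're paying for.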
Weaknesses: cost and lock-in. At scale Pinecone is 5-10× more expensive than self-hosted Qdrant for similar workloads, and switching away is non-trivial because its API and indexing semantics don't map cleanly onto the open-source alternatives.
Weaviate: hybrid search built in, but heavy
Weaviate is opinionated. It bundles its own data modeling layer, has hybrid search and reranking baked in, supports modules for embedding generation, and is genuinely well-designed. Open-source self-hosting works well; the cloud product is solid.
Use Weaviate when: you want hybrid search and built-in modules without integrating five tools, or you need GraphQL queries on your vectors (rare). It's been losing ground to simpler tools recently — the all-in-one approach is less popular than "Postgres + a small specialist for the part where it matters."
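If you do want the built-in hybrid search, it's genuinely low-ceremony. A sketch with the v4 Python client: the "Docs" collection and its "body" property are assumptions, as is a vectorizer module on the collection so Weaviate can embed the query itself.

```python
# Weaviate hybrid-search sketch (v4 Python client); names are made up.
import weaviate

client = weaviate.connect_to_local()  # or connect_to_weaviate_cloud(...)
docs = client.collections.get("Docs")

# alpha blends the two scores: 0 = pure BM25, 1 = pure vector
res = docs.query.hybrid(query="reset my password", alpha=0.5, limit=5)
for obj in res.objects:
    print(obj.properties.get("body"))

client.close()
```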
Milvus, Chroma, LanceDB, MongoDB Atlas, Elastic, Redis
- Milvus — the high-end choice for billion-scale vector workloads at companies like Tencent and ByteDance. Operationally heavy. Don't use it unless you need that scale.
- Chroma — the best DX for prototyping. Embedded, no server, very pythonic. Use for Jupyter notebooks and demos. Switch to something else for production.
- LanceDB — interesting architecture (data lives on S3, multimodal-friendly). Worth watching. Production maturity is improving but still behind Qdrant.
- MongoDB Atlas Vector Search — fine if you're already on MongoDB Atlas. Don't migrate to MongoDB just for this.
- Elastic / OpenSearch — fine if you're already running Elastic. Don't migrate just for this either.
- Redis Vector Search — fast, in-memory, expensive. Niche use cases (real-time agents needing sub-10ms vector queries).
What benchmark charts don't tell you
The benchmarks vendors publish ("100M vectors at 5ms p99") almost never match your workload. Real performance depends on:
- Your filter cardinality (how many vectors match your metadata filter pre-search)
- Your update pattern (frequent updates and deletes degrade HNSW graphs and force rebuilds or compaction)
- Your dimension count (1536 vs 768 vs 384 changes everything)
- Your hardware (NVMe SSDs vs network-attached storage matters more than the DB)
If benchmarks are guiding your decision, run your own with your data. "My DB hits 5ms" without context is meaningless.
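A harness for that doesn't need to be fancy. Something like this, where `run_query` is a stand-in for whichever client call you're evaluating (pgvector, Qdrant, Pinecone, ...), gets you p50/p99 on your own data:

```python
# Minimal latency harness: your queries, your DB, your percentiles.
import statistics
import time

def bench(run_query, queries, warmup=20):
    for q in queries[:warmup]:  # warm caches before measuring
        run_query(q)
    latencies = []
    for q in queries:
        t0 = time.perf_counter()
        run_query(q)
        latencies.append((time.perf_counter() - t0) * 1000)  # ms
    latencies.sort()

    def pct(p):
        return latencies[min(int(p * len(latencies)), len(latencies) - 1)]

    return {"p50_ms": pct(0.50), "p99_ms": pct(0.99),
            "mean_ms": statistics.mean(latencies)}
```

Run it with production-shaped queries and filters, not random vectors, or you're just re-publishing the vendor's chart.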
When NOT to use a vector DB at all
- Under 10k vectors: keep them in memory as a NumPy array and brute-force the search (see the sketch after this list). Adding a DB adds ops cost without benefit.
- Static dataset that doesn't change: precompute everything, store in Parquet, use FAISS in-process. No DB needed.
- Long-context model can fit your data: with Gemini 2.5 Pro's 1M+ context, you can sometimes skip retrieval entirely on small corpora.
- Keyword search would actually work better: don't vectorize when ILIKE or BM25 solves your problem at zero cost.
Many products that "need" RAG actually need full-text search and a smart prompt. Try the simpler thing first.
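The in-memory option from the first bullet really is this small. Corpus size and dimension here are made up; normalize once up front, and top-k becomes a single matmul:

```python
# No-database search for small corpora.
import numpy as np

embeddings = np.random.rand(5_000, 768).astype(np.float32)  # your corpus
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

def search(query: np.ndarray, k: int = 10) -> np.ndarray:
    q = query / np.linalg.norm(query)
    scores = embeddings @ q               # cosine similarity against everything
    return np.argsort(scores)[::-1][:k]   # indices of the top-k matches

top = search(np.random.rand(768).astype(np.float32))
```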
Migration cost: pick something you can leave
Lock-in matters. The honest hierarchy:
- pgvector: zero lock-in (just SQL, easy to dump and move)
- Qdrant self-hosted: low lock-in (open-source, multiple cloud providers)
- Qdrant Cloud / Weaviate Cloud: medium (managed wrappers around open-source engines, so a self-hosting exit path exists)
- Pinecone: high (proprietary, custom indexing semantics)
If you're early and unsure, start with pgvector. The cost of moving from pgvector to anything else later is much lower than the cost of moving off Pinecone if you outgrow it.
Decision tree
- Already running Postgres, < 10M vectors: pgvector
- Want self-hosted, dedicated, hybrid search: Qdrant
- No ops capacity, willing to pay: Pinecone
- Need GraphQL or strongly opinionated stack: Weaviate
- Prototyping in Jupyter: Chroma
- Billion-scale vectors and a dedicated infra team: Milvus
Next steps
- Try pgvector first if you have Postgres — it's a 5-minute install
- Read about HNSW vs IVF indexing to understand tuning knobs
- Look into hybrid search (BM25 + vector) — it almost always beats pure vector
- Set up evals so you can compare DBs on your actual data, not vendor benchmarks