Vector Search at Scale: Postgres pgvector for Real Production

pgvector quietly became the right answer for most vector workloads over 2023-2026. The value proposition: store vectors in the same Postgres database that holds your operational data, query them with SQL, and avoid the operational burden of a separate vector database. The HNSW index introduced in pgvector 0.5.0 is competitive with dedicated vector databases for most workloads. This post walks through where pgvector shines, where Pinecone still wins, and how to operate it in production.

What pgvector provides

pgvector is a Postgres extension that adds a vector data type and similarity search.

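Vector data type. Store fixed-dimension vectors in a regular Postgres column. The vector type accepts up to 16,000 dimensions, but HNSW and IVFFlat indexes on it are limited to 2,000.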

Distance operators. L2 distance (<->), cosine distance (<=>), and negative inner product (<#>).

HNSW index. Hierarchical Navigable Small World index for fast approximate nearest neighbor search.

IVFFlat index. Inverted file index: the older option, faster to build and lighter on memory than HNSW, but with a weaker speed-recall trade-off. Still useful for some workloads.

SQL integration. Vector queries combine with regular SQL: joins, filters, and aggregations.

Transactional semantics. Vector operations participate in Postgres transactions.
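To make the three distance operators concrete, here is a minimal pure-Python sketch of what each one computes. The function names are illustrative and not part of pgvector. The detail most worth noting is that <#> returns the inner product negated, so smaller values still mean closer matches.

```python
import math


def l2_distance(a, b):
    """Euclidean distance: what pgvector's <-> operator returns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Inner product negated: what pgvector's <#> operator returns."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: what pgvector's <=> operator returns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy 2-dimensional vectors, chosen only for illustration.
query = [1.0, 0.0]
same_direction = [2.0, 0.0]
orthogonal = [0.0, 3.0]

print(l2_distance(query, same_direction))
print(cosine_distance(query, same_direction))
print(cosine_distance(query, orthogonal))
print(negative_inner_product(query, same_direction))
```

Cosine distance ignores magnitude, which is why the vector pointing the same way scores as identical while its L2 distance is nonzero.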

Where pgvector wins

Several scenarios favor pgvector:

Operational data plus vectors. When your application data (users, products, documents) lives in Postgres and you want vector search over related embeddings, pgvector avoids the integration complexity of a separate vector database.

Hybrid search. When queries combine vector similarity with structured filters (date ranges, user permissions, category filters), Postgres handles both natively.

Small-to-moderate scale. Up to roughly 100M vectors, pgvector performs reasonably on appropriately sized hardware.

Existing Postgres operations. When your team already has Postgres operational maturity, pgvector adds minimal operational burden.

ACID transactions. When vector updates need to be transactional with other data.

Cost. No additional vendor license; uses your existing Postgres infrastructure.
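The hybrid search case is where the single-database argument is easiest to see. The sketch below builds one parameterized query that filters on structured columns and orders by cosine distance. The documents table and its id, title, category, created_at, and embedding columns are hypothetical, and the %s placeholders assume a psycopg-style driver; the function only builds the query and does not connect to anything.

```python
def to_vector_literal(embedding):
    """Format floats in pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


def build_hybrid_query(embedding, category, since, limit=10):
    """Return (sql, params) for a filtered nearest-neighbor query.

    Table and column names are hypothetical; adapt them to your schema.
    """
    sql = (
        "SELECT id, title, embedding <=> %s::vector AS distance "
        "FROM documents "
        "WHERE category = %s AND created_at >= %s "
        # Repeat the distance expression so the ORDER BY matches what
        # a vector index with vector_cosine_ops can serve.
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    vector = to_vector_literal(embedding)
    params = (vector, category, since, vector, limit)
    return sql, params


sql, params = build_hybrid_query([0.1, 0.2, 0.3], "news", "2025-01-01", limit=5)
print(sql)
print(params)
```

With an approximate index, the filter is applied to the candidates the index scan returns, so a very selective filter can yield fewer rows than the limit. Raising hnsw.ef_search is one mitigation; it is also worth checking whether your pgvector version offers iterative index scans.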

Where Pinecone (and others) still win

Several scenarios still favor dedicated vector databases:

Large scale. Above ~100M vectors with high query rates, dedicated vector databases typically perform better.

Maximum query throughput. Specialized vector engines (Pinecone, Weaviate, Qdrant, Milvus, and others) typically outperform pgvector at high QPS.

Specialized features. Hybrid sparse-dense search, learned indexes, and advanced filtered vector search: some dedicated databases have features pgvector doesn’t.

Managed service simplicity. Pinecone's managed service removes the operational burden; comparable managed Postgres services exist but require more configuration.

Cross-cluster vector replication. Specialized vector DBs handle this better.

The HNSW configuration

The pgvector HNSW index exposes a handful of parameters that matter:

m: maximum connections per node in each graph layer. Higher = better recall but more storage and slower builds. Default 16; production typically 16-32.

ef_construction: size of the candidate list used while building the graph. Higher = better recall but slower build. Default 64; production typically 64-200.

ef_search (query time, set via hnsw.ef_search): controls the recall vs speed trade-off. Higher = better recall but slower queries. Default 40; tune per workload.

Distance type. Cosine, L2, or inner product: pick the one your embedding model's documentation recommends, and build the index with the matching operator class so queries can use it.

Tuning these takes real work, and production performance depends on it.
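The parameters above map onto two kinds of statements: build-time options on CREATE INDEX and a query-time setting. As a rough sketch, the helpers below generate both as strings. The helper names, the index naming scheme, and the documents/embedding identifiers are assumptions for illustration; the operator class names and the hnsw.ef_search setting are pgvector's own.

```python
# Map a distance type to the pgvector operator class for the vector type.
OPCLASSES = {
    "l2": "vector_l2_ops",
    "inner_product": "vector_ip_ops",
    "cosine": "vector_cosine_ops",
}


def hnsw_index_ddl(table, column, distance="cosine", m=16, ef_construction=64):
    """Build a CREATE INDEX statement for an HNSW index.

    Identifiers are interpolated directly, so pass only trusted names.
    The index name pattern is a hypothetical convention.
    """
    opclass = OPCLASSES[distance]
    index_name = f"{table}_{column}_hnsw_idx"
    return (
        f"CREATE INDEX CONCURRENTLY {index_name} ON {table} "
        f"USING hnsw ({column} {opclass}) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)})"
    )


def ef_search_statement(ef_search):
    """Build the session-level setting that trades query speed for recall."""
    return f"SET hnsw.ef_search = {int(ef_search)}"


print(hnsw_index_ddl("documents", "embedding", m=24, ef_construction=128))
print(ef_search_statement(100))
```

Because m and ef_construction are fixed at build time, changing them means rebuilding the index, while ef_search can be adjusted per session or per transaction without touching the index.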

The operational realities

Several production realities:

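Index build time. HNSW indexes on large tables can take hours to build. Plan a build strategy: concurrent index builds, periodic rebuilds, or partition-based builds. Raising maintenance_work_mem and, on pgvector 0.6.0+, using parallel maintenance workers can shorten builds.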

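Memory requirements. HNSW indexes are stored on disk like any other Postgres index, but query latency degrades sharply once the index no longer fits in memory (shared buffers plus OS cache). Size instances so the index stays cached.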

Vacuum and bloat. Heavy update workloads produce table bloat that affects vector queries. Standard Postgres maintenance applies.

Read replica patterns. Vector workloads are typically read-heavy, so read replicas help.

Backup and restore. Vector data backs up like regular Postgres data, but embeddings and their indexes add considerable size, which lengthens backup and restore times.

Monitoring. Standard Postgres monitoring plus vector-specific metrics (recall, query latency distribution).
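Recall is the metric that standard Postgres monitoring will not give you. A common approach, sketched below under assumptions, is to run a sample of queries twice, once through the HNSW index and once as an exact scan (for example with index scans disabled inside a transaction), then compare the returned IDs. The function names and the sample IDs are illustrative.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that the approximate top-k also found."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    hits = len(set(approx_ids[:k]) & exact_top)
    return hits / len(exact_top)


def mean_recall(pairs, k):
    """Average recall@k over (approx_ids, exact_ids) pairs, one per query."""
    scores = [recall_at_k(approx, exact, k) for approx, exact in pairs]
    return sum(scores) / len(scores)


# Made-up result IDs for two sample queries.
samples = [
    ([1, 2, 3, 9], [1, 2, 3, 4]),  # index missed one true neighbor
    ([5, 6, 7, 8], [5, 6, 7, 8]),  # index matched the exact scan
]
print(mean_recall(samples, k=4))
```

Tracking this number over time, alongside latency percentiles, shows whether a given ef_search setting still holds up as the table grows.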

The cost comparison

Rough comparison at scale:

Pinecone managed. ~$70/month base for small deployments; substantially more at scale. Predictable; operationally simple.

pgvector on AWS RDS. RDS Postgres instance with appropriate sizing. ~$200-$2000/month depending on scale. Largely AWS-managed.

pgvector self-managed. EC2 + Postgres. ~$100-$1000/month. The operational burden is yours.

Weaviate, Qdrant, Milvus managed. Comparable to Pinecone with variation.

Weaviate, Qdrant, Milvus self-hosted. Similar to pgvector self-managed economics.

For most workloads, pgvector on managed Postgres offers the best economics with reasonable operational complexity.

The decision framework

For most teams in 2026:

Pick pgvector when you already run Postgres in production and your vector workload fits within its scale limits. This is the default choice.

Pick Pinecone when your scale exceeds pgvector's practical limits, or when managed simplicity matters more than cost.

Pick Weaviate/Qdrant/Milvus for specific feature needs (hybrid search, specialized filtering) or when you want to avoid a Postgres-centered architecture.

Pick a combination in some scenarios: pgvector for operational vectors, a dedicated database for a large embedding corpus.

What we typically see at clients

Common patterns:

pgvector as default. Most modern AI integrations start here.

Pinecone for greenfield AI products where managed simplicity matters.

Migration from Pinecone to pgvector at organizations where cost matters more than convenience.

Specialized vector databases at large AI-focused organizations with specific feature needs.

Where pdpspectra fits

Our data engineering practice builds production AI systems with appropriate vector search architecture.

pgvector is the right default for most vector workloads. Talk to our team about your AI architecture.