Pinecone vs pgvector: Cost, Performance, and When to Switch

pgvector beats hosted vector DBs on cost at 1M-10M vectors, but the maths changes at 100M+. Here's an honest comparison with real numbers and decision rules.

By Andrii Votiakov on 2026-03-05

Pinecone is polished, well-documented, and priced for venture-backed startups burning through Series A money. If you're past that phase, the bill starts to look different. I've talked to teams paying $700-2,000/month for Pinecone on workloads that would run fine on pgvector at $50-200/month. I've also talked to teams where pgvector would genuinely struggle. Here's the honest version of both.

Quick answer

pgvector beats hosted vector databases (Pinecone, Weaviate Cloud, Qdrant Cloud) on cost at 1-10 million vectors, often by 70-85%. At 100 million vectors with high query concurrency, the gap narrows and the operational trade-offs start mattering. The decision isn't just cost — it's whether your team will maintain indexes, tune ef_search, and manage Postgres for a search-critical workload.

How Pinecone bills

Pinecone pricing (early 2026) comes in two models, serverless and pod-based:

  • Serverless: priced per read unit and write unit. Roughly $0.04-0.05 per 1 million vectors queried per month, plus storage.
  • Pod-based (dedicated): p1.x1 starts around $70/month, scales with pods and replicas.

A typical production deployment (5 million 1536-dimension vectors, ~5 million queries/month on the serverless tier) runs roughly $400-600/month depending on query patterns.

A heavier deployment (20 million vectors, 30 million queries/month on pod-based infrastructure) runs $800-1,500/month.

The cost per query doesn't feel like much. It compounds.

pgvector cost at 1M, 10M, and 100M vectors

The cost to run pgvector depends on the Postgres instance you're running it on. Here's what the maths looks like on AWS (RDS or Aurora):

1 million 1536-dim vectors:

  • Storage: ~6 GB (1536 dims × 4 bytes × 1M = ~6 GB raw, plus index overhead)
  • Instance needed: db.r6g.large (2 vCPU, 16 GB RAM) handles this comfortably with HNSW
  • Cost: ~$130/month on-demand, ~$85/month with 1-year RI
  • Pinecone equivalent: $80-150/month (serverless, low query volume) to $350+ (pod, moderate queries)

10 million 1536-dim vectors:

  • Storage: ~60 GB raw; HNSW index overhead adds another 1.5-2x, so plan for ~150 GB
  • Instance needed: db.r6g.2xlarge (8 vCPU, 64 GB RAM) to keep HNSW graph in memory
  • Cost: ~$450/month on-demand, ~$295/month with 1-year RI
  • Pinecone equivalent: $400-900/month depending on query concurrency

100 million 1536-dim vectors:

  • Storage: ~600 GB raw, HNSW overhead pushes this to 1-1.5 TB
  • Instance needed: db.r6g.8xlarge or larger (64 GB RAM minimum for HNSW, ideally 128+ GB)
  • Cost: $1,500-3,000/month on-demand depending on size
  • Pinecone equivalent: $1,000-3,000+/month on pod infrastructure

At 100M vectors, pgvector and Pinecone are cost-competitive. The question becomes operational, not financial.
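The storage arithmetic above is easy to sanity-check. A minimal sketch, assuming float32 embeddings; the 1.5-2.5x total-footprint multipliers are the rough planning numbers used above, not measured values:

```python
# Rough storage estimates for float32 embeddings in pgvector.
# The overhead multipliers are planning rules of thumb, not measurements.
def raw_gb(n_vectors: int, dims: int = 1536, bytes_per_dim: int = 4) -> float:
    """Raw vector payload in GB, before any index overhead."""
    return n_vectors * dims * bytes_per_dim / 1e9

for n in (1_000_000, 10_000_000, 100_000_000):
    raw = raw_gb(n)
    # Plan for roughly 1.5-2.5x raw once the HNSW index is included.
    print(f"{n:>11,} vectors: {raw:6.1f} GB raw, "
          f"plan for {raw * 1.5:.0f}-{raw * 2.5:.0f} GB total")
```

Running this reproduces the tiers above: ~6 GB raw at 1M, ~61 GB at 10M (plan for ~150 GB), ~614 GB at 100M (plan for up to ~1.5 TB).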

HNSW vs IVFFlat: which index type to use

pgvector supports two index types. The choice matters for both performance and cost.

HNSW (Hierarchical Navigable Small World):

  • Excellent query latency (1-5ms at 10M vectors with good tuning)
  • Builds are slower and memory-intensive (plan for 2-3x vector size in RAM during build)
  • Better recall at high speed: 95%+ with ef_search = 100
  • My recommendation for most production workloads

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- At query time, tune recall vs speed
SET hnsw.ef_search = 100;

IVFFlat (Inverted File Index):

  • Faster to build, lower memory overhead
  • Worse recall and latency than HNSW for most query patterns
  • Useful when: you're updating vectors frequently and can't afford HNSW rebuild time
  • Requires more careful lists tuning (roughly sqrt(row_count))

CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 1000);

SET ivfflat.probes = 10;
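The lists heuristic is easy to script. A small sketch; the rows/1000 branch below ~1M rows follows pgvector's documented starting-point guidance, with sqrt(rows) above that as quoted here:

```python
import math

def ivfflat_lists(row_count: int) -> int:
    """Starting point for the IVFFlat `lists` parameter.

    Rule of thumb: rows/1000 up to ~1M rows, sqrt(rows) above that.
    Treat the result as a starting point to benchmark, not a final value.
    """
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return round(math.sqrt(row_count))

print(ivfflat_lists(1_000_000))    # 1000, matching the example above
print(ivfflat_lists(10_000_000))   # 3162
```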

For most teams: use HNSW. The better recall and query latency outweigh the build overhead unless you're doing very frequent bulk updates.

Where pgvector struggles

Be honest about the limitations before committing:

Query concurrency: Postgres handles concurrent queries well, but HNSW search is CPU-bound and doesn't parallelise within a single query. At 100+ concurrent vector queries per second, you'll need connection pooling (PgBouncer), read replicas, or both. Pinecone's serverless tier absorbs this transparently.
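If you go the pooling route, a minimal pgbouncer.ini sketch; the hostname, database name, and pool sizes are placeholders to adapt, not recommendations:

```ini
[databases]
; placeholder connection target - point at your primary or a read replica
appdb = host=db.internal port=5432 dbname=appdb

[pgbouncer]
listen_port = 6432
; transaction pooling keeps server connection counts low under bursty load
pool_mode = transaction
max_client_conn = 500
default_pool_size = 20
```

One caveat with transaction pooling: a session-level `SET hnsw.ef_search = 100` won't reliably persist between transactions, because server connections are reassigned. Use `SET LOCAL hnsw.ef_search = 100` inside the same transaction as the query instead.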

Index updates at scale: Adding vectors to an HNSW index incrementally is supported, but insert throughput drops as the graph grows. If you're ingesting millions of vectors per day, index performance degrades and periodic rebuilds are needed. Pinecone handles this cleanly.

RAM requirements: HNSW performs best when the graph fits in RAM. At 100M 1536-dim vectors, you need 128+ GB RAM for the index. That means large, expensive Postgres instances. If the HNSW index spills to disk, latency spikes significantly.

Operational complexity: You own backups, failover, vacuuming, and index maintenance. For search-critical workloads, that's real engineering responsibility.

Where pgvector genuinely wins

Joins with your application data: This is the pgvector advantage that doesn't get enough attention. If you want to filter by user ID, tenant, date range, or status before or after vector search, doing that inside Postgres is dramatically faster and simpler than filtering post-retrieval from Pinecone. One query, no round trips.

-- Filter before vector search
SELECT id, content, 1 - (embedding <=> $1) AS score
FROM documents
WHERE user_id = $2 AND created_at > NOW() - INTERVAL '90 days'
ORDER BY embedding <=> $1
LIMIT 20;

Pinecone supports metadata filtering, but it works differently — you can't do arbitrary SQL conditions, and complex filters degrade performance.

No new infrastructure to maintain: If you're already running Postgres, pgvector is just an extension. No new vendor, no new API keys, no new monitoring stack.

Data privacy and residency: Your vectors stay in your VPC. For applications handling sensitive documents — legal, medical, financial — keeping embeddings in the same Postgres instance as the source data simplifies compliance significantly.

The migration path from Pinecone

If you decide to migrate, the process is mechanical:

  1. Install pgvector: CREATE EXTENSION vector;
  2. Add embedding column to your existing table (or create a dedicated embeddings table)
  3. Backfill: export vectors from Pinecone, insert into Postgres in batches of 1,000-5,000
  4. Build the HNSW index (plan for a maintenance window or build CONCURRENTLY)
  5. Run both in parallel for a week — compare query result quality
  6. Flip application code to Postgres, decommission Pinecone namespace

The main effort is the backfill and testing. For 5 million vectors, expect 2-4 hours of data migration and 1-3 days of testing.
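Step 3 above is where most of the migration time goes. A sketch of the batching half; `fetch_from_pinecone` and the `documents(id, embedding)` table are hypothetical placeholders, and the commented insert loop assumes a Postgres driver such as psycopg:

```python
from itertools import islice

def batches(iterable, size=1000):
    """Yield lists of up to `size` items - keeps each INSERT transaction small."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

# Hypothetical usage (driver not imported here; adapt names to your schema):
# for batch in batches(fetch_from_pinecone(namespace="docs"), size=2000):
#     cur.executemany(
#         "INSERT INTO documents (id, embedding) VALUES (%s, %s)",
#         [(v.id, v.values) for v in batch],
#     )
#     conn.commit()  # commit per batch so one failure doesn't roll back everything
```

Committing per batch also means an interrupted backfill can resume from the last committed batch instead of starting over.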

Realistic numbers

Three clients I've worked with on this:

Client A — 3M vectors, RAG pipeline for internal docs, ~2M queries/month:

  • Pinecone cost: $420/month
  • pgvector on db.r6g.xlarge RI: $95/month
  • Saving: $325/month, 77% reduction

Client B — 12M vectors, product recommendation engine, 15M queries/month:

  • Pinecone pod-based: $1,100/month
  • pgvector on db.r6g.4xlarge RI + PgBouncer: $380/month
  • Saving: $720/month, 65% reduction

Client C — 90M vectors, high-concurrency search product, 100M queries/month:

  • Pinecone: $2,400/month
  • pgvector was technically viable but required db.x2g.4xlarge (256 GB RAM) + 3 read replicas: $2,100/month
  • Saving: marginal. Kept Pinecone for operational simplicity.

Client C is the real lesson: pgvector is often the right answer, but not always.
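The break-even maths is simple to run for your own workload; a small sketch that reproduces the three client outcomes above:

```python
def monthly_saving(hosted: float, selfhosted: float) -> tuple[float, float]:
    """Absolute ($/month) and percentage saving from moving off the hosted service."""
    diff = hosted - selfhosted
    return diff, 100 * diff / hosted

# (Pinecone $/month, pgvector $/month) from the three clients above
clients = {"A": (420, 95), "B": (1100, 380), "C": (2400, 2100)}
for name, (pinecone, pg) in clients.items():
    diff, pct = monthly_saving(pinecone, pg)
    print(f"Client {name}: ${diff:.0f}/month saved ({pct:.0f}%)")
```

Client C's ~12% saving is the "marginal" case: once the percentage drops that low, operational simplicity is worth more than the difference.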

See also: /blog/replacing-algolia-with-pgvector for combining pgvector with Postgres full-text search for hybrid search.


If you want me to run the cost and performance maths for your specific workload on a pay-for-savings basis, book a call.