Technical Documentation

turbopuffer exposes a straightforward HTTP API with the following core primitives:

Write: Upsert or delete documents (vectors plus metadata attributes) in a namespace. Batch writes are supported at up to 10k writes/s (32 MB/s) per namespace.
Query: Run ANN vector search, BM25 full-text search, or hybrid search, with metadata pre- and post-filters and custom ranking. Returns the top-k results, with optional recall tuning.
Namespace metadata: Retrieve stats and configuration for a given namespace.
Authentication: All requests require a bearer token passed in the standard HTTP Authorization header.
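
As a rough illustration of how the primitives above fit together, the sketch below builds (but does not send) an authenticated write request and a query request using only the Python standard library. The base URL, the endpoint paths, the placeholder token, and the body fields (upsert_rows, rank_by, top_k) are assumptions made for illustration and are not taken from this page; check the API reference for the actual request shapes.

```python
import json
import urllib.request

# All values below are assumptions for illustration only.
BASE_URL = "https://api.turbopuffer.com"  # assumed base URL
API_KEY = "tpuf_example_key"              # placeholder bearer token


def build_request(path: str, body: dict, api_key: str = API_KEY) -> urllib.request.Request:
    """Build (without sending) a JSON POST request carrying a bearer token."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical write: upsert one document (vector + metadata attribute).
write_req = build_request(
    "/v2/namespaces/example-ns",  # assumed path
    {"upsert_rows": [{"id": 1, "vector": [0.1, 0.2], "title": "hello"}]},
)

# Hypothetical query: ANN search returning the top 10 results.
query_req = build_request(
    "/v2/namespaces/example-ns/query",  # assumed path
    {"rank_by": ["vector", "ANN", [0.1, 0.2]], "top_k": 10},
)

# To actually send a request: urllib.request.urlopen(write_req)
```

Building the request separately from sending it keeps the token handling in one place and makes the payload easy to inspect before any network call is made.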

Architecture: Writes are durably persisted to object storage (S3). A memory/SSD cache layer is maintained per namespace to serve warm queries (p50 of about 14 ms on 10M × 1024-dim vectors). A cold namespace is hydrated into the cache on first access, so its first query is slower than later ones. Namespace branching provides instant copy-on-write snapshots for isolation, testing, or parallel workloads.
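
The warm/cold behavior described above can be pictured with a toy read-through cache. This is a minimal sketch of the general pattern, not turbopuffer's implementation: ObjectStore and CachedStore are hypothetical names, and plain dictionaries stand in for both S3 and the memory/SSD cache.

```python
class ObjectStore:
    """Stand-in for durable object storage: namespace -> {doc_id: doc}."""

    def __init__(self):
        self._data = {}
        self.reads = 0  # counts full namespace loads (cold hydrations)

    def put(self, namespace, doc_id, doc):
        self._data.setdefault(namespace, {})[doc_id] = doc

    def load(self, namespace):
        self.reads += 1
        return dict(self._data.get(namespace, {}))


class CachedStore:
    """Read-through cache kept per namespace in front of the object store."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # namespace -> {doc_id: doc}

    def write(self, namespace, doc_id, doc):
        # Persist durably first, then keep an already-warm cache current.
        self.store.put(namespace, doc_id, doc)
        if namespace in self.cache:
            self.cache[namespace][doc_id] = doc

    def get(self, namespace, doc_id):
        # Cold namespace: hydrate from object storage on first access.
        if namespace not in self.cache:
            self.cache[namespace] = self.store.load(namespace)
        return self.cache[namespace].get(doc_id)


store = ObjectStore()
cached = CachedStore(store)
cached.write("ns-a", 1, {"title": "hello"})
first = cached.get("ns-a", 1)   # cold: triggers one load from the store
second = cached.get("ns-a", 1)  # warm: served from the cache
```

In this model only the first read of a namespace touches the object store; every later read is served from the cache, which is why warm latency can be much lower than cold latency.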

Limits (current production):

Max documents per namespace: 500M documents (2 TB)
Max namespaces: Unlimited
Max global write throughput: Unlimited
Vector search recall@10: 90–100%
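
For readers unfamiliar with the recall@10 figure, the sketch below uses the usual definition: the fraction of the true (exhaustive-search) top-k results that the approximate search also returns in its top k. The function name and the sample IDs are illustrative only.

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k IDs that appear in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        raise ValueError("exact_ids must contain at least one result")
    hits = len(set(approx_ids[:k]) & exact_top)
    return hits / len(exact_top)


# Exhaustive search returns IDs 0..9; the ANN search misses ID 9.
exact = list(range(10))
approx = [0, 1, 2, 3, 4, 5, 6, 7, 8, 42]
recall = recall_at_k(approx, exact, k=10)  # 9 of 10 found -> 0.9
```

A recall@10 of 90% therefore means that, on average, nine of the ten true nearest neighbors appear in the returned results.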

SDKs and quickstart guides are available in the docs at /docs.