brinicle
brinicle is a C++ retrieval engine with a Python API, built around disk-first, low-RAM HNSW search. It supports three search engines:
- VectorEngine for raw vector similarity search
- ItemSearchEngine for lexical, semantic, and hybrid item search
- AutocompleteEngine for autocomplete and query suggestions
Key Features
- Disk-first HNSW vector search — indexes live on disk, not in RAM
- Low-RAM indexing and querying — operates in environments with as little as 256MB RAM
- Streaming-first ingest — one vector, item, or suggestion at a time; no need to load the full dataset into memory
- Three specialized engines — raw vector search, structured item search, and autocomplete
- Lexical, semantic, and hybrid search — ItemSearchEngine supports lexical-only, semantic-only, and hybrid search with configurable alpha
- Insert, upsert, delete, and compact rebuild — full lifecycle management for your indexes
- Batch search — run multiple queries in parallel across all engines
- Custom scoring — configurable lexical scoring for item search and autocomplete
- Multiple distance functions — L2, cosine distance, and dot product distance for VectorEngine
- Python bindings over a C++ core — high performance with an easy-to-use Python API
- HTTP server — deploy as a standalone service with the built-in FastAPI server
- Multi-language SDKs — official clients for PHP, Laravel, TypeScript, Python, and Go
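
The three distance functions listed for VectorEngine can be illustrated with a minimal, library-independent sketch. The function names below are hypothetical and are not brinicle's API. The dot product "distance" is assumed here to be the negated dot product so that smaller means closer; brinicle's exact convention may differ (some libraries use 1 minus the dot product instead).

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    return 1.0 - dot(a, b) / (norm_a * norm_b)


def dot_product_distance(a, b):
    """ASSUMPTION: negated dot product, so a larger dot product ranks closer."""
    return -dot(a, b)
```

Cosine distance ignores vector length, while L2 and dot product do not, so the right choice usually depends on whether your embeddings are normalized.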
When to Use brinicle
brinicle is designed for datasets under 10M vectors in environments with tight RAM constraints. It excels in the following scenarios:
- Low-cost deployments — when you need vector search but can’t justify the cost of high-RAM instances
- Edge computing — when your service runs on resource-constrained edge machines
- Tight containers — when your Docker containers have strict memory limits
- Small to medium datasets — when you have up to 10M vectors and need efficient ANN search
- Structured catalog search — when you need to search products, movies, books, or other structured items with lexical, semantic, or hybrid search
- Autocomplete and suggestions — when you need low-RAM query or title suggestions
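
To see why a disk-first index matters under tight RAM limits, it helps to estimate how much memory the raw vectors alone would need if they were held in RAM. The sketch below is back-of-the-envelope arithmetic, not a measurement of brinicle: it assumes float32 values (4 bytes each), uses hypothetical dimensions of 128 and 768, and ignores HNSW graph links and other overhead.

```python
MIB = 1024 * 1024


def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Bytes needed to hold every vector in memory (graph links excluded)."""
    return num_vectors * dim * bytes_per_value


def fits_in_ram(num_vectors, dim, ram_budget_mib):
    """True if the raw vectors alone fit inside the given RAM budget."""
    return raw_vector_bytes(num_vectors, dim) <= ram_budget_mib * MIB


if __name__ == "__main__":
    # ASSUMPTION: 128 and 768 are illustrative dimensions, not brinicle defaults.
    for dim in (128, 768):
        size_mib = raw_vector_bytes(10_000_000, dim) / MIB
        ok = fits_in_ram(10_000_000, dim, ram_budget_mib=256)
        print(f"10M x {dim} float32: {size_mib:,.0f} MiB, fits in 256 MiB: {ok}")
```

Under these assumptions, 10M vectors at 128 dimensions already need about 4.8 GiB for the raw data, far beyond a 256MB container.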
Choosing an Engine
| Engine | Use it for |
|---|---|
| VectorEngine | Raw vector similarity search |
| ItemSearchEngine | Product, catalog, or structured item search |
| AutocompleteEngine | Autocomplete, title suggestions, and query suggestions |
Use VectorEngine when you already have vectors.
Use ItemSearchEngine when your records have titles, categories, attributes, and optional semantic vectors.
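
Hybrid search with a configurable alpha is commonly implemented as a weighted blend of a lexical score and a semantic score. The sketch below shows that general idea only; it is not ItemSearchEngine's actual formula. It assumes both scores are already normalized to the range 0 to 1 and that alpha weights the semantic side, and the helper names are hypothetical.

```python
def hybrid_score(lexical, semantic, alpha):
    """Blend two normalized scores.

    ASSUMPTION: alpha = 0 is lexical-only and alpha = 1 is semantic-only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * semantic + (1.0 - alpha) * lexical


def rank(candidates, alpha):
    """Order item ids by blended score, best first.

    candidates maps item id -> (lexical_score, semantic_score).
    """
    return sorted(
        candidates,
        key=lambda item_id: hybrid_score(*candidates[item_id], alpha),
        reverse=True,
    )
```

With candidates such as {"a": (0.9, 0.2), "b": (0.3, 0.8)}, alpha = 0 ranks "a" first and alpha = 1 ranks "b" first, which is how a single parameter can cover lexical-only, semantic-only, and hybrid modes.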
Use AutocompleteEngine when you want prefix-like suggestions for queries, titles, or curated phrases.
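
For intuition about what prefix suggestions over curated phrases look like, here is a minimal in-memory sketch using binary search over a sorted list. It is a generic technique shown for illustration and is not how AutocompleteEngine is implemented; the function name and the `limit` parameter are hypothetical, and phrases are assumed to be lowercase and pre-sorted.

```python
import bisect


def suggest(sorted_phrases, prefix, limit=5):
    """Return up to `limit` phrases that start with `prefix`.

    ASSUMPTION: sorted_phrases is a lowercase list in ascending order.
    """
    prefix = prefix.lower()
    # First position whose phrase is >= prefix; matches are contiguous from here.
    start = bisect.bisect_left(sorted_phrases, prefix)
    matches = []
    for phrase in sorted_phrases[start:]:
        if not phrase.startswith(prefix):
            break
        matches.append(phrase)
        if len(matches) == limit:
            break
    return matches
```

This approach keeps every phrase in memory and only matches exact prefixes, which is the kind of limitation a dedicated low-RAM suggestion engine is meant to avoid.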
Performance
In a 256MB RAM / 1 CPU container on MNIST 60K vectors, brinicle passed while Qdrant, Weaviate, and Milvus were all OOMKilled. On SIFT 1M vectors, brinicle achieves recall and latency competitive with FAISS and hnswlib while keeping the index disk-backed.

| System | Build (s) | Recall@10 | Avg Latency (ms) | QPS |
|---|---|---|---|---|
| FAISS | 237.3 | 0.970 | 0.092 | 10,857 |
| hnswlib | 241.3 | 0.964 | 0.093 | 10,712 |
| brinicle | 243.8 | 0.970 | 0.103 | 9,731 |
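
For readers unfamiliar with the metrics in the table, the sketch below shows the usual definition of Recall@k and how average latency relates to throughput when queries run one at a time. The source does not describe how the benchmark computed its numbers, so treat this as the standard definitions under stated assumptions; the function names are hypothetical.

```python
def recall_at_k(retrieved, ground_truth, k=10):
    """Fraction of true top-k neighbors found, averaged over all queries.

    retrieved and ground_truth are lists of id lists, one per query.
    """
    hits = sum(
        len(set(found[:k]) & set(truth[:k]))
        for found, truth in zip(retrieved, ground_truth)
    )
    return hits / (k * len(ground_truth))


def sequential_qps(avg_latency_ms):
    """ASSUMPTION: queries are issued sequentially on a single thread."""
    return 1000.0 / avg_latency_ms
```

Under the sequential assumption, an average latency of 0.1 ms corresponds to about 10,000 queries per second, which is the same order of magnitude as the QPS column above.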