brinicle

brinicle is a C++ retrieval engine with a Python API, built around disk-first, low-RAM HNSW search. It supports three search engines:

VectorEngine for raw vector similarity search
ItemSearchEngine for lexical, semantic, and hybrid item search
AutocompleteEngine for autocomplete and query suggestions

brinicle is designed for large indexes and constrained environments where keeping the full index in RAM is not practical.

Key Features

Disk-first HNSW vector search — indexes live on disk, not in RAM
Low-RAM indexing and querying — operates in environments with as little as 256MB RAM
Streaming-first ingest — one vector, item, or suggestion at a time; no need to load the full dataset into memory
Three specialized engines — raw vector search, structured item search, and autocomplete
Lexical, semantic, and hybrid search — ItemSearchEngine supports lexical-only, semantic-only, and hybrid search with configurable alpha
Insert, upsert, delete, and compact rebuild — full lifecycle management for your indexes
Batch search — run multiple queries in parallel across all engines
Custom scoring — configurable lexical scoring for item search and autocomplete
Multiple distance functions — L2, cosine distance, and dot product distance for VectorEngine
Python bindings over a C++ core — high performance with an easy-to-use Python API
HTTP server — deploy as a standalone service with the built-in FastAPI server
Multi-language SDKs — official clients for PHP, Laravel, TypeScript, Python, and Go
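
The "configurable alpha" for hybrid search can be pictured as a weight that blends a lexical score with a semantic score. The sketch below is a generic illustration, not brinicle's API: the function names, the assumption that both scores are already normalized to [0, 1], and the convention that `alpha = 1.0` means semantic-only are all hypothetical and may not match how ItemSearchEngine defines alpha.

```python
def hybrid_score(lexical: float, semantic: float, alpha: float) -> float:
    """Blend two scores that are assumed to be normalized to [0, 1].

    ASSUMPTION: alpha = 1.0 is semantic-only and alpha = 0.0 is lexical-only.
    The real engine may use the opposite convention or a different formula.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * semantic + (1.0 - alpha) * lexical


def rank(candidates, alpha):
    """Sort (item, lexical_score, semantic_score) tuples, best blended score first."""
    return sorted(
        candidates,
        key=lambda c: hybrid_score(c[1], c[2], alpha),
        reverse=True,
    )


# Hypothetical scores for the query "red shoes".
candidates = [
    ("red running shoes", 0.9, 0.4),  # strong keyword match
    ("crimson sneakers", 0.2, 0.8),   # strong meaning match, few shared words
]

for alpha in (0.0, 0.5, 1.0):
    print(alpha, [item for item, _, _ in rank(candidates, alpha)])
```

Under these assumptions, moving alpha toward 1.0 favors items that match the query's meaning, while moving it toward 0.0 favors items that share its words.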

When to Use brinicle

brinicle is designed for datasets under 10M vectors in environments with tight RAM constraints. It excels in the following scenarios:

Low-cost deployments — when you need vector search but can’t justify the cost of high-RAM instances
Edge computing — when your service runs on resource-constrained edge machines
Tight containers — when your Docker containers have strict memory limits
Small to medium datasets — when you have up to 10M vectors and need efficient ANN search
Structured catalog search — when you need to search products, movies, books, or other structured items with lexical, semantic, or hybrid search
Autocomplete and suggestions — when you need low-RAM query or title suggestions

Choosing an Engine

| Engine | Use it for |
| --- | --- |
| `VectorEngine` | Raw vector similarity search |
| `ItemSearchEngine` | Product, catalog, or structured item search |
| `AutocompleteEngine` | Autocomplete, title suggestions, and query suggestions |

Use VectorEngine when you already have vectors. Use ItemSearchEngine when your records have titles, categories, attributes, and optional semantic vectors. Use AutocompleteEngine when you want prefix-like suggestions for queries, titles, or curated phrases.
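
VectorEngine is listed as supporting L2, cosine distance, and dot product distance. For readers choosing among them, here is a plain-Python sketch of the textbook definitions. It is an illustration only: brinicle's exact conventions are not stated here (for example, some libraries report squared L2, or define dot product distance as `1 - dot`), so the function names and the negated-dot convention below are assumptions.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance (not squared)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 2 means opposite."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot(a, b) / norm


def dot_product_distance(a, b):
    """ASSUMPTION: negated dot product, so that smaller means more similar."""
    return -dot(a, b)


query = [1.0, 0.0]
print(l2_distance(query, [0.0, 1.0]))
print(cosine_distance(query, [2.0, 0.0]))
print(dot_product_distance(query, [2.0, 0.0]))
```

Cosine distance ignores vector length, while L2 and dot product do not, so the choice usually depends on whether the embeddings are normalized.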

Performance

In a container limited to 256MB RAM and 1 CPU, brinicle ran the MNIST benchmark (60K vectors) to completion, while Qdrant, Weaviate, and Milvus were all OOM-killed. On SIFT (1M vectors), brinicle achieves recall and latency competitive with FAISS and hnswlib while keeping the index disk-backed.
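
For readers unfamiliar with the Recall@10 metric, the sketch below shows the common definition: the fraction of the true 10 nearest neighbors that the approximate search returns in its top 10. How the benchmark here computed recall is not described, so treat this as the usual convention rather than the exact procedure; the function name is hypothetical.

```python
def recall_at_k(retrieved, ground_truth, k=10):
    """Fraction of the true top-k neighbor IDs found in the retrieved top-k."""
    truth = set(ground_truth[:k])
    if not truth:
        raise ValueError("ground_truth must contain at least one ID")
    hits = len(truth & set(retrieved[:k]))
    return hits / len(truth)


# Hypothetical IDs: the approximate search found 3 of the 4 true neighbors.
exact = [7, 2, 9, 4]
approx = [7, 9, 4, 11]
print(recall_at_k(approx, exact, k=4))
```

In practice the value is averaged over all test queries, with the ground truth coming from an exact brute-force search.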

| System | Build (s) | Recall@10 | Avg Latency (ms) | QPS |
| --- | --- | --- | --- | --- |
| FAISS | 237.3 | 0.970 | 0.092 | 10,857 |
| hnswlib | 241.3 | 0.964 | 0.093 | 10,712 |
| brinicle | 243.8 | 0.970 | 0.103 | 9,731 |
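
To put the table in relative terms, the short script below computes each system's percentage difference from FAISS. The numbers are copied from the table; the dictionary layout and the `relative_change` helper are just one illustrative way to do the arithmetic.

```python
# Figures copied from the benchmark table above.
results = {
    "FAISS":    {"build_s": 237.3, "recall": 0.970, "latency_ms": 0.092, "qps": 10857},
    "hnswlib":  {"build_s": 241.3, "recall": 0.964, "latency_ms": 0.093, "qps": 10712},
    "brinicle": {"build_s": 243.8, "recall": 0.970, "latency_ms": 0.103, "qps": 9731},
}


def relative_change(value, baseline):
    """Percentage difference of value from baseline (positive means larger)."""
    return (value - baseline) / baseline * 100.0


baseline = results["FAISS"]
for name, row in results.items():
    if name == "FAISS":
        continue
    print(
        f"{name}: "
        f"build {relative_change(row['build_s'], baseline['build_s']):+.1f}%, "
        f"latency {relative_change(row['latency_ms'], baseline['latency_ms']):+.1f}%, "
        f"QPS {relative_change(row['qps'], baseline['qps']):+.1f}%"
    )
```

By this arithmetic, brinicle matches FAISS on Recall@10 with roughly 12% higher average latency and roughly 10% lower QPS.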

License

brinicle is licensed under the Apache License, Version 2.0.
