brinicle

brinicle is a C++ retrieval engine with a Python API, built around disk-first, low-RAM HNSW search. It supports three search engines:
  • VectorEngine for raw vector similarity search
  • ItemSearchEngine for lexical, semantic, and hybrid item search
  • AutocompleteEngine for autocomplete and query suggestions
brinicle is designed for large indexes and constrained environments where keeping the full index in RAM is not practical.

Key Features

  • Disk-first HNSW vector search — indexes live on disk, not in RAM
  • Low-RAM indexing and querying — operates in environments with as little as 256MB RAM
  • Streaming-first ingest — one vector, item, or suggestion at a time; no need to load the full dataset into memory
  • Three specialized engines — raw vector search, structured item search, and autocomplete
  • Lexical, semantic, and hybrid search — ItemSearchEngine supports lexical-only, semantic-only, and hybrid search with configurable alpha
  • Insert, upsert, delete, and compact rebuild — full lifecycle management for your indexes
  • Batch search — run multiple queries in parallel across all engines
  • Custom scoring — configurable lexical scoring for item search and autocomplete
  • Multiple distance functions — L2, cosine distance, and dot product distance for VectorEngine
  • Python bindings over a C++ core — high performance with an easy-to-use Python API
  • HTTP server — deploy as a standalone service with the built-in FastAPI server
  • Multi-language SDKs — official clients for PHP, Laravel, TypeScript, Python, and Go
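
For readers unfamiliar with the three distance functions listed above, here is a minimal pure-Python sketch of how they are commonly defined. It is not brinicle's implementation. The function names are illustrative, and the conventions are assumptions: some libraries use squared L2 instead of L2, or `1 - dot` instead of the negated dot product. In every case a smaller value means a closer match.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for identical direction, 2 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dot_product_distance(a, b):
    """Negated inner product, so larger dot products rank as closer.

    Assumed convention; engines differ on the exact transform.
    """
    return -sum(x * y for x, y in zip(a, b))


query = [1.0, 0.0]
print(l2_distance(query, [0.0, 1.0]))
print(cosine_distance(query, [0.0, 1.0]))
print(dot_product_distance(query, [2.0, 0.0]))
```

Cosine distance ignores vector length, while dot product distance rewards it, so the two only agree on ranking when vectors are normalized.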

When to Use brinicle

brinicle is designed for datasets under 10M vectors in environments with tight RAM constraints. It excels in the following scenarios:
  • Low-cost deployments — when you need vector search but can’t justify the cost of high-RAM instances
  • Edge computing — when your service runs on resource-constrained edge machines
  • Tight containers — when your Docker containers have strict memory limits
  • Small to medium datasets — when you have up to 10M vectors and need efficient ANN search
  • Structured catalog search — when you need to search products, movies, books, or other structured items with lexical, semantic, or hybrid search
  • Autocomplete and suggestions — when you need low-RAM query or title suggestions
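
To make the RAM constraint concrete, the back-of-the-envelope sketch below estimates how much memory the raw vectors alone would need if held fully in RAM. The 128-dimension float32 vectors are an assumption chosen for illustration (SIFT vectors have this shape), the 256MB figure is read as 256 MiB, and the estimate leaves out the HNSW graph links, which would add to it.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_component=4):
    """Bytes needed to hold the raw vectors in memory (float32 by default)."""
    return num_vectors * dim * bytes_per_component


num_vectors = 10_000_000   # upper end of the dataset size discussed above
dim = 128                  # assumed dimensionality, for illustration only

total = raw_vector_bytes(num_vectors, dim)
gib = total / 2**30
budget = 256 * 2**20       # 256 MiB container limit (assumed interpretation)

print(f"{gib:.2f} GiB of raw vectors")
print(f"{total / budget:.1f}x a 256 MiB budget")
```

Under these assumptions the vectors alone need about 4.77 GiB, roughly 19 times the container limit, which is the gap a disk-first index is meant to close.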

Choosing an Engine

Engine              Use it for
VectorEngine        Raw vector similarity search
ItemSearchEngine    Product, catalog, or structured item search
AutocompleteEngine  Autocomplete, title suggestions, and query suggestions
Use VectorEngine when you already have vectors. Use ItemSearchEngine when your records have titles, categories, attributes, and optional semantic vectors. Use AutocompleteEngine when you want prefix-like suggestions for queries, titles, or curated phrases.
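
When records carry both text fields and semantic vectors, hybrid search has to combine a lexical score with a semantic score. This README does not give brinicle's formula, so the sketch below shows one common approach as an assumption: a convex combination weighted by alpha. The names `hybrid_score` and `rank`, the direction of alpha (1.0 meaning semantic-only), and the scores normalized to [0, 1] are all hypothetical, and the candidate scores are made-up illustration values.

```python
def hybrid_score(lexical, semantic, alpha):
    """Blend two scores in [0, 1]; alpha=0 is lexical-only, alpha=1 is semantic-only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * semantic + (1.0 - alpha) * lexical


def rank(candidates, alpha):
    """Return item ids sorted by blended score, best first.

    candidates maps item id -> (lexical_score, semantic_score).
    """
    return sorted(
        candidates,
        key=lambda item: hybrid_score(*candidates[item], alpha),
        reverse=True,
    )


# Illustrative scores for the query "red shoes" (not real engine output).
candidates = {
    "red running shoes": (0.9, 0.4),   # strong keyword overlap
    "crimson sneakers": (0.2, 0.8),    # close in meaning, few shared words
}

print(rank(candidates, alpha=0.0))
print(rank(candidates, alpha=1.0))
```

With alpha at 0.0 the keyword match ranks first, and with alpha at 1.0 the semantically closer item does. Values in between trade one signal against the other.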

Performance

In a container limited to 256MB of RAM and 1 CPU, on MNIST (60K vectors), brinicle completed the benchmark while Qdrant, Weaviate, and Milvus were all OOMKilled. On SIFT (1M vectors), brinicle achieves recall and latency competitive with FAISS and hnswlib while keeping the index disk-backed.
System    Build (s)  Recall@10  Avg Latency (ms)  QPS
FAISS     237.3      0.970      0.092             10,857
hnswlib   241.3      0.964      0.093             10,712
brinicle  243.8      0.970      0.103             9,731
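
The latency and QPS columns can be cross-checked against each other. The sketch below assumes queries were issued one at a time on a single thread, so that QPS is roughly 1000 divided by the average latency in milliseconds. That benchmark setup is an assumption, not something stated here. The figures are copied from the table above.

```python
# Figures copied from the table above.
results = {
    "FAISS":    {"latency_ms": 0.092, "qps": 10857},
    "hnswlib":  {"latency_ms": 0.093, "qps": 10712},
    "brinicle": {"latency_ms": 0.103, "qps": 9731},
}


def implied_qps(latency_ms):
    """QPS implied by average latency, assuming sequential single-thread queries."""
    return 1000.0 / latency_ms


for name, row in results.items():
    implied = implied_qps(row["latency_ms"])
    gap = abs(implied - row["qps"]) / row["qps"]
    print(f"{name}: implied {implied:.0f} QPS, reported {row['qps']}, gap {gap:.1%}")

# Relative cost of the disk-backed index compared with FAISS.
latency_ratio = results["brinicle"]["latency_ms"] / results["FAISS"]["latency_ms"]
qps_ratio = results["brinicle"]["qps"] / results["FAISS"]["qps"]
print(f"brinicle latency is {latency_ratio - 1:.0%} higher than FAISS")
print(f"brinicle QPS is {1 - qps_ratio:.0%} lower than FAISS")
```

Under that assumption the implied and reported QPS agree to within 1% for all three systems. brinicle's average latency is about 12% higher than FAISS and its throughput about 10% lower, at the same Recall@10.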

License

brinicle is licensed under the Apache License, Version 2.0.