VectorEngine
VectorEngine is the raw vector search engine in Brinicle.
Use it when you already have embeddings or numeric vectors and want approximate nearest neighbor search through a disk-first HNSW index.
VectorEngine supports:
- build
- insert
- upsert
- delete
- single-query search
- batch search
- search with distances
- compact rebuild
- graph optimization
Constructor
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| index_path | str | required | Base path for the index files |
| dim | int | required | Vector dimension |
| delta_ratio | float | 0.10 | Maintenance threshold for delta and deleted records |
| M | int | 16 | HNSW graph connectivity |
| ef_construction | int | 200 | Build-time search width |
| ef_search | int | 64 | Default query-time search width |
| build_n_threads | int | 1 | Number of build threads |
| seed | int | 0 | Random seed for graph construction |
| dist_func | str | "l2" | Distance function used by the index |
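One way to keep these settings together is a small configuration object. The sketch below is a hypothetical helper, not part of Brinicle: EngineConfig and as_kwargs are names invented for illustration, and only the field names and defaults are taken from the parameter table.

```python
from dataclasses import dataclass, asdict

# Distance function names listed in the Distance Functions section.
KNOWN_DIST_FUNCS = ("l2", "cosine_distance", "dot_product_distance")


@dataclass(frozen=True)
class EngineConfig:
    """Hypothetical container for the documented constructor parameters."""

    index_path: str            # required: base path for the index files
    dim: int                   # required: vector dimension
    delta_ratio: float = 0.10
    M: int = 16
    ef_construction: int = 200
    ef_search: int = 64
    build_n_threads: int = 1
    seed: int = 0
    dist_func: str = "l2"

    def __post_init__(self):
        # Basic sanity checks; the engine itself may enforce different rules.
        if self.dim <= 0:
            raise ValueError("dim must be a positive integer")
        if self.dist_func not in KNOWN_DIST_FUNCS:
            raise ValueError(f"unknown dist_func: {self.dist_func!r}")

    def as_kwargs(self):
        """Return the settings as a plain dict of keyword arguments."""
        return asdict(self)


cfg = EngineConfig(index_path="data/my_index", dim=128, M=32)
print(cfg.as_kwargs())
```

Assuming the constructor accepts these names as keyword arguments, the resulting dict could be unpacked into it, for example VectorEngine(**cfg.as_kwargs()).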
Choosing HNSW Parameters
The HNSW parameters control the trade-off between search quality, indexing speed, and memory usage:
- M: Higher values improve recall but increase memory usage and indexing time. Values between 16 and 64 are common. For high-recall applications, use M=48 or higher.
- ef_construction: Higher values produce better graphs at the cost of slower builds. Values between 200 and 1024 are typical.
- ef_search: Higher values improve recall at the cost of slower queries. This can also be overridden per query using the efs parameter.
- delta_ratio: Controls the size of the delta segment relative to the main segment. A value of 0.10 means the delta segment can grow to 10% of the main segment before requiring a merge.
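The typical ranges quoted above can be turned into a quick sanity check before a long build. This is a minimal sketch: check_hnsw_params and TYPICAL_RANGES are hypothetical names, and the ranges are guidance from the text, not limits enforced by the engine.

```python
# Typical ranges quoted in the text; they are guidance, not hard limits.
TYPICAL_RANGES = {
    "M": (16, 64),
    "ef_construction": (200, 1024),
}


def check_hnsw_params(M, ef_construction):
    """Return human-readable notes for values outside the typical ranges."""
    notes = []
    for name, value in (("M", M), ("ef_construction", ef_construction)):
        low, high = TYPICAL_RANGES[name]
        if value < low:
            notes.append(f"{name}={value} is below the typical range {low}-{high}")
        elif value > high:
            notes.append(f"{name}={value} is above the typical range {low}-{high}")
    return notes


print(check_hnsw_params(M=48, ef_construction=400))   # both values in range
print(check_hnsw_params(M=8, ef_construction=2048))   # both values out of range
```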
Distance Functions
VectorEngine supports these distance functions:
| dist_func | Meaning |
|---|---|
| "l2" | Squared Euclidean distance |
| "cosine_distance" | 1 - cosine_similarity(a, b) |
| "dot_product_distance" | -dot_product(a, b) |
With dot_product_distance, a larger dot product becomes a smaller distance: a result with distance -0.90 is ranked before the result with distance -0.20.
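The three definitions in the table can be written out in plain Python to see how they rank candidates. This is an illustrative sketch of the formulas only, not Brinicle's implementation; the function names and sample vectors are chosen for the example.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine_similarity(a, b); assumes neither vector is all zeros."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def dot_product_distance(a, b):
    """Negated dot product, so a larger dot product gives a smaller distance."""
    return -dot(a, b)


query = [1.0, 0.0]
candidates = {"a": [0.9, 0.1], "b": [0.2, 0.8]}

# Sort ascending by distance: the most similar candidate comes first.
ranked = sorted(candidates, key=lambda cid: dot_product_distance(query, candidates[cid]))
print(ranked)  # ['a', 'b'] because -0.9 < -0.2
```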
Building an Index
Use build mode to create a new index.
Vectors must be float32 arrays with the same dimension as the index.
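Input data often arrives as float64 or as nested lists, so it can help to normalize it before ingestion. The helper below is a hypothetical sketch using NumPy; prepare_vectors is not a Brinicle function, and the C-contiguous layout is an assumption about what the engine prefers.

```python
import numpy as np


def prepare_vectors(vectors, dim):
    """Return a C-contiguous float32 array of shape (n, dim), or raise ValueError."""
    arr = np.ascontiguousarray(vectors, dtype=np.float32)
    if arr.ndim == 1:
        # Treat a single vector as a batch of one.
        arr = arr.reshape(1, -1)
    if arr.ndim != 2 or arr.shape[1] != dim:
        raise ValueError(f"expected shape (n, {dim}), got {arr.shape}")
    return arr


rng = np.random.default_rng(0)
raw = rng.random((1000, 128))          # float64 by default
vectors = prepare_vectors(raw, dim=128)
print(vectors.dtype, vectors.shape)    # float32 (1000, 128)
```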
Finalize Options
finalize(...) completes a pending build, insert, or upsert.
Passing 0 for build parameters uses the engine defaults.
When optimize=False, inserts and upserts are absorbed into the delta index.
When optimize=True, Brinicle may rebuild the index if the projected delta size crosses the maintenance threshold controlled by delta_ratio.
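The maintenance decision described above can be sketched as simple arithmetic. This is an assumption-laden illustration, not the engine's code: should_rebuild is a hypothetical name, the threshold is taken to be delta_ratio times the main segment size, the comparison is assumed to be strict, and deleted records are not modeled.

```python
def should_rebuild(main_size, delta_size, pending, delta_ratio=0.10, optimize=True):
    """Decide whether finalizing `pending` records would trigger a rebuild.

    Assumption: the threshold is delta_ratio * main_size and the comparison
    is strict; the engine's actual rule may differ.
    """
    if not optimize:
        # Without optimization, new records are always absorbed into the delta.
        return False
    projected_delta = delta_size + pending
    return projected_delta > delta_ratio * main_size


# 1,000,000 indexed vectors, 80,000 already in the delta.
print(should_rebuild(1_000_000, 80_000, 30_000))                  # True: 110,000 > 100,000
print(should_rebuild(1_000_000, 80_000, 10_000))                  # False: 90,000 <= 100,000
print(should_rebuild(1_000_000, 80_000, 30_000, optimize=False))  # False
```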
Search
Basic Search
Use search(...) to return external IDs only.
Search with Distance
Use search_with_distance(...) to return both IDs and distances.
Search Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| q | np.ndarray | required | Query vector (float32, 1-D) |
| k | int | 10 | Maximum number of results |
| efs | int | 64 | Query-time search width |
| threshold | float | inf | Maximum accepted distance |
A higher efs usually improves recall but increases query latency.
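To make the roles of k and threshold concrete, and to have a ground truth for measuring how efs affects recall, an exact search can be written in a few lines. This is a plain-Python reference sketch, not Brinicle's search: exact_search and recall_at_k are hypothetical names, the distance is squared L2, and treating threshold as an inclusive bound is an assumption.

```python
import heapq
import math


def l2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_search(q, items, k=10, threshold=math.inf):
    """Return up to k (external_id, distance) pairs, nearest first.

    `items` maps external IDs to vectors. Candidates farther than
    `threshold` are dropped (the inclusive bound is an assumption).
    """
    scored = ((l2(q, vec), ext_id) for ext_id, vec in items.items())
    kept = [(d, ext_id) for d, ext_id in scored if d <= threshold]
    return [(ext_id, d) for d, ext_id in heapq.nsmallest(k, kept)]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k IDs that the approximate result recovered."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


items = {10: [0.0, 0.0], 11: [1.0, 0.0], 12: [3.0, 4.0]}
q = [0.0, 0.0]

print(exact_search(q, items, k=2))                 # [(10, 0.0), (11, 1.0)]
print(exact_search(q, items, k=3, threshold=5.0))  # ID 12 (distance 25.0) is filtered out
print(recall_at_k([10, 12], [10, 11]))             # 0.5
```

Comparing the IDs returned by an approximate search at several efs values against exact_search results gives a direct recall-versus-latency picture for a given dataset.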
Batch Search
Use search_batch(...) to search multiple query vectors.
queries must be a two-dimensional float32 array.
n_jobs controls parallel query execution when parallel execution is available.
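Conceptually, a batch search applies a single-query search to each row and returns the results in input order, optionally spreading the work over several workers. The sketch below illustrates that idea with a thread pool; search_batch_sketch and toy_search are hypothetical names, and how Brinicle actually parallelizes queries internally is not specified here.

```python
from concurrent.futures import ThreadPoolExecutor


def search_batch_sketch(search_one, queries, n_jobs=1):
    """Run `search_one` on every query and return results in input order.

    `search_one` is any callable taking one query vector. With n_jobs <= 1
    the queries run sequentially; otherwise a thread pool is used.
    """
    if n_jobs <= 1:
        return [search_one(q) for q in queries]
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # Executor.map preserves the order of the input queries.
        return list(pool.map(search_one, queries))


# Stand-in for a real single-query search: index of the largest component.
def toy_search(q):
    return max(range(len(q)), key=lambda i: q[i])


queries = [[0.1, 0.9], [0.7, 0.3], [0.5, 0.6]]
print(search_batch_sketch(toy_search, queries, n_jobs=2))  # [1, 0, 1]
```

In Python, threads only speed things up when the per-query work releases the GIL, which native search code typically can do; for pure-Python work the benefit would be limited.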
Building from File
For very large datasets, you can build the index directly from a file.
Index State
Utility Functions
Brinicle also exposes some utility functions for distance computation and brute-force search.
Complete API Reference
init
build, insert, upsert
ingest
Call init(...) before calling ingest(...).
finalize
search
search_with_distance
Returns (external_id, distance) pairs.