Python SDK

brinicle provides two Python packages:
  • brinicle — The core library with in-process C++ engine (installed via pip install brinicle)
  • brinicle-client — HTTP client for the brinicle server (installed via pip install brinicle-client)

Core Library

The core library provides direct in-process access to the vector engine without needing a separate server.

Installation

pip install brinicle

Usage

import numpy as np
import brinicle

# Vector search
engine = brinicle.VectorEngine("vector_index", dim=384)
engine.init(mode="build")
for i in range(1000):
    engine.ingest(str(i), np.random.randn(384).astype(np.float32))
engine.finalize()
results = engine.search(np.random.randn(384).astype(np.float32), k=10)

# Item search
item_engine = brinicle.ItemSearchEngine("item_index", dim=96)
item_engine.init(mode="build")
item_engine.ingest("p1", title="Apple iPhone 15", category="Electronics")
item_engine.finalize()
results = item_engine.search("iphone", k=5)

# Autocomplete
ac = brinicle.AutocompleteEngine("ac_index", dim=48)
ac.init(mode="build")
ac.ingest("iphone 15 pro max", "iphone 15 pro max")
ac.finalize()
results = ac.search("iph", k=5)

See the VectorEngine, ItemSearchEngine, and AutocompleteEngine documentation for detailed usage.

HTTP Client

The HTTP client communicates with a brinicle server instance and supports both synchronous and asynchronous operations.

Installation

pip install brinicle-client

Synchronous Client

from brinicle_client import BrinicleClient
import numpy as np

with BrinicleClient("http://localhost:1984") as client:
    # Create an index
    client.create_index("my_index", dim=128)

    # Initialize and ingest
    client.init("my_index", "build")
    client.ingest("my_index", "v1", np.random.randn(128).tolist())
    client.ingest("my_index", "v2", np.random.randn(128).tolist())
    client.finalize("my_index", optimize=True)

    # Search
    results = client.search("my_index", np.random.randn(128).astype(np.float32), k=5)
    print(results)  # e.g. ['v1', 'v2'] (order depends on the random vectors)

Asynchronous Client

from brinicle_client import BrinicleClientAsync
import numpy as np

async def main():
    async with BrinicleClientAsync("http://localhost:1984") as client:
        await client.create_index("my_index", dim=128)

        await client.init("my_index", "build")
        await client.ingest("my_index", "v1", np.random.randn(128).tolist())
        await client.finalize("my_index", optimize=True)

        results = await client.search(
            "my_index",
            np.random.randn(128).astype(np.float32),
            k=5,
        )
        print(results)

import asyncio
asyncio.run(main())
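
The main reason to reach for the asynchronous client is to overlap several requests. The sketch below shows that pattern with asyncio.gather. It uses FakeAsyncClient, a hypothetical stand-in written for this example so the snippet runs without a server; with the real client you would pass the BrinicleClientAsync instance instead, assuming its search method accepts the arguments shown above.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in exposing only an async search(index_name, query_vector, k)."""

    async def search(self, index_name, query_vector, k=5):
        await asyncio.sleep(0)  # yield to the event loop, as a network call would
        return [f"{index_name}-hit-{i}" for i in range(k)]


async def search_many(client, index_name, queries, k=5):
    """Run one search per query vector concurrently; results keep input order."""
    tasks = [client.search(index_name, q, k=k) for q in queries]
    return await asyncio.gather(*tasks)


queries = [[0.0] * 4, [1.0] * 4]
results = asyncio.run(search_many(FakeAsyncClient(), "my_index", queries, k=2))
print(results)
```

asyncio.gather returns results in the same order as the awaitables it was given, so results[i] corresponds to queries[i].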

API Reference

Both sync and async clients provide the same methods:

Vector Engine Methods

| Method | Description |
| --- | --- |
| health_check() | Check server status |
| list_indexes() | List all indexes |
| create_index(index_name, dim, delta_ratio, params) | Create a new index |
| delete_index(index_name, destroy) | Delete or close an index |
| get_index_status(index_name) | Get index status |
| load_index(index_name) | Load existing index from disk |
| init(index_name, mode) | Initialize ingest session |
| ingest(index_name, external_id, vector) | Ingest a single vector |
| ingest_batch(index_name, ids, vectors, dim) | Batch ingest (binary) |
| finalize(index_name, optimize, params) | Finalize ingest session |
| delete_items(index_name, external_ids, return_not_found) | Delete items |
| rebuild(index_name, params) | Rebuild and compact the index |
| search(index_name, query_vector, k, efs) | Search for nearest neighbors |
| optimize(index_name) | Optimize HNSW graph |
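
The table only says that ingest_batch is binary; the exact layout is not documented here. As a rough sketch, a batch can be validated on the client before it is sent: every ID needs a vector, and every vector needs exactly dim values. The helper pack_batch is hypothetical, and the row-major float32 layout it produces is an assumption made for illustration, not the documented wire format.

```python
from array import array


def pack_batch(ids, vectors, dim):
    """Validate a batch and flatten it into one contiguous float32 buffer."""
    if len(ids) != len(vectors):
        raise ValueError("ids and vectors must have the same length")
    flat = array("f")  # 'f' is a C float, 4 bytes on common platforms
    for ext_id, vec in zip(ids, vectors):
        if len(vec) != dim:
            raise ValueError(f"vector {ext_id!r} has {len(vec)} values, expected {dim}")
        flat.extend(vec)
    return flat.tobytes()


payload = pack_batch(["v1", "v2"], [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]], dim=3)
print(len(payload))  # 2 vectors * 3 values * 4 bytes = 24
```

Catching a dimension mismatch locally gives a clearer error than a failed request, whatever layout the server ends up expecting.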

ItemSearch Methods

| Method | Description |
| --- | --- |
| create_item_index(index_name, dim, delta_ratio, params, lexical_config) | Create an item search index |
| init_item_ingest(index_name, mode) | Initialize item ingest session |
| ingest_item(index_name, external_id, vector, fields) | Ingest a single item with metadata |
| search_items(index_name, query, filters, k, efs) | Search items with text query and filters |
| delete_item_index(index_name, destroy) | Delete or close an item index |
| get_item_index_status(index_name) | Get item index status |

Autocomplete Methods

| Method | Description |
| --- | --- |
| create_autocomplete_index(index_name, dim, delta_ratio, params, autocomplete_config) | Create an autocomplete index |
| init_autocomplete_ingest(index_name, mode) | Initialize autocomplete ingest session |
| ingest_autocomplete(index_name, key, vector) | Ingest a single autocomplete entry |
| search_autocomplete(index_name, query, k) | Search for autocomplete suggestions |
| delete_autocomplete_index(index_name, destroy) | Delete or close an autocomplete index |
| get_autocomplete_index_status(index_name) | Get autocomplete index status |

ItemSearch (HTTP Client)

The HTTP client provides ItemSearch methods that communicate with the brinicle server’s ItemSearch API. ItemSearch enables structured search over items with lexical fields, combining vector similarity with metadata filtering. It suits product catalogs, document collections, and any dataset that you need to search by both content and structured attributes.

Unlike basic vector search, which operates on raw embeddings, ItemSearch lets you ingest items with human-readable fields such as titles, descriptions, and categories, then run filtered searches that combine full-text matching with vector similarity.

Each item index is backed by its own vector index, so you still benefit from brinicle’s high-performance HNSW graph for similarity lookups. On top of that, ItemSearch adds a lexical configuration layer that controls how text fields are tokenized and matched, so you can tune relevance for your domain.

Creating an Item Index

Use create_item_index to create a new item search index with an optional lexical configuration. The lexical_config parameter controls how text fields are analyzed during ingestion and search: you can specify which fields are searchable, how they are tokenized, and whether stemming or stop-word removal should be applied. If omitted, a default configuration is used that treats all text fields as searchable with standard tokenization.

from brinicle_client import BrinicleClient, BrinicleClientAsync

# Synchronous
with BrinicleClient("http://localhost:1984") as client:
    lexical_config = {
        "searchable_fields": ["title", "description", "category"],
        "tokenizer": "standard",
        "lowercase": True,
        "stem": False,
        "remove_stop_words": True,
    }

    client.create_item_index(
        "products",
        dim=384,
        delta_ratio=0.10,
        params={"M": 48, "ef_construction": 1024, "ef_search": 512},
        lexical_config=lexical_config,
    )

# Asynchronous ("async with" is only valid inside a coroutine)
import asyncio

async def main():
    async with BrinicleClientAsync("http://localhost:1984") as client:
        await client.create_item_index(
            "products",
            dim=384,
            delta_ratio=0.10,
            params={"M": 48, "ef_construction": 1024, "ef_search": 512},
            lexical_config=lexical_config,
        )

asyncio.run(main())
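
A typo in a configuration key is easy to miss, so it can help to check the dictionary before sending it. The sketch below is a hypothetical client-side check, not part of brinicle-client. It assumes the only valid keys are the five shown in the example above; the server may accept others.

```python
# Assumption: these are the only keys, taken from the example above.
KNOWN_KEYS = {"searchable_fields", "tokenizer", "lowercase", "stem", "remove_stop_words"}


def check_lexical_config(config):
    """Return a list of problems found in a lexical_config dict (empty if none)."""
    problems = [f"unknown key: {key}" for key in sorted(set(config) - KNOWN_KEYS)]
    fields = config.get("searchable_fields", [])
    if not isinstance(fields, list) or not all(isinstance(f, str) for f in fields):
        problems.append("searchable_fields must be a list of strings")
    for flag in ("lowercase", "stem", "remove_stop_words"):
        if flag in config and not isinstance(config[flag], bool):
            problems.append(f"{flag} must be a bool")
    return problems


print(check_lexical_config({"searchable_fields": ["title"], "stem": False}))  # []
print(check_lexical_config({"stemm": True}))  # ['unknown key: stemm']
```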

Ingesting Items

Initialize an item ingest session with init_item_ingest, then ingest items with their vectors and metadata fields using ingest_item. Each item consists of an external ID, a vector matching the index dimension, and a dictionary of string fields. Fields listed in the searchable_fields of your lexical_config will be indexed for full-text search, while other fields are stored but not searchable. After all items are ingested, call finalize to build the index.

with BrinicleClient("http://localhost:1984") as client:
    client.init_item_ingest("products", "build")

    client.ingest_item(
        "products",
        external_id="p1",
        vector=[0.1] * 384,
        fields={
            "title": "Apple iPhone 15 Pro Max",
            "description": "Latest iPhone with A17 Pro chip",
            "category": "Electronics",
            "price": "1199",
        },
    )

    client.ingest_item(
        "products",
        external_id="p2",
        vector=[0.2] * 384,
        fields={
            "title": "Samsung Galaxy S24 Ultra",
            "description": "Flagship Android with AI features",
            "category": "Electronics",
            "price": "1299",
        },
    )

    client.finalize("products", optimize=True)
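
Because fields is described as a dictionary of strings (note that price is passed as "1199" above), records coming from a database or JSON file usually need converting first. The helper to_string_fields below is a hypothetical sketch of that step; dropping None values is a choice made for this example, not documented behavior.

```python
def to_string_fields(record):
    """Convert a record's values to strings, skipping None values."""
    return {str(key): str(value) for key, value in record.items() if value is not None}


record = {"title": "Apple iPhone 15 Pro Max", "price": 1199, "discount": None}
fields = to_string_fields(record)
print(fields)  # {'title': 'Apple iPhone 15 Pro Max', 'price': '1199'}
```

The resulting dictionary can then be passed as the fields argument of ingest_item.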

Searching Items

Search for items using a text query with optional structured filters via search_items. The query parameter supplies the text, which is matched against the searchable fields defined in the lexical configuration. The filters parameter accepts structured conditions with must and must_not keys that narrow results by exact field values or ranges. Use k to set the number of results and efs to set the search beam width, which trades recall against latency.

with BrinicleClient("http://localhost:1984") as client:
    # Simple text search
    results = client.search_items("products", query="iphone", k=10)
    # {'results': [...], 'total': 5}

    # Search with filters
    results = client.search_items(
        "products",
        query="smartphone",
        filters={
            "must": {"category": "Electronics"},
            "must_not": {"price": ["0", "100"]},
        },
        k=10,
        efs=64,
    )

    for result in results["results"]:
        print(f"{result['external_id']} (score: {result['score']})")
        print(f"Fields: {result['fields']}")
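
The response shape used above (a results list whose entries carry external_id, score, and fields) lends itself to a small post-processing step. The sketch below filters and orders hits on the client. The sample response is made up for illustration, and the helper assumes a higher score means a better match, which this page does not state.

```python
def top_hits(response, min_score=0.0):
    """Return (external_id, score) pairs at or above min_score, best first."""
    hits = [
        (item["external_id"], item["score"])
        for item in response.get("results", [])
        if item["score"] >= min_score
    ]
    # Assumption: higher score means a better match.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Made-up response in the shape shown above.
sample = {
    "results": [
        {"external_id": "p2", "score": 0.41, "fields": {"category": "Electronics"}},
        {"external_id": "p1", "score": 0.87, "fields": {"category": "Electronics"}},
    ],
    "total": 2,
}
print(top_hits(sample, min_score=0.5))  # [('p1', 0.87)]
```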

Managing Item Indexes

You can close or permanently destroy item indexes, and check their status at any time. Closing an index preserves its data on disk for later reloading, while destroying it permanently removes all data. The status check returns the index name, dimension, and whether a rebuild is needed after recent ingestions.

with BrinicleClient("http://localhost:1984") as client:
    # Check status
    status = client.get_item_index_status("products")
    # {'index_name': 'products', 'dim': 384, 'has_index': True, 'needs_rebuild': False}

    # Close (preserve data)
    client.delete_item_index("products")

    # Permanently destroy
    client.delete_item_index("products", destroy=True)
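
The has_index and needs_rebuild flags in the status dictionary can drive a simple maintenance decision. The function below is a hypothetical sketch of that logic. The action names it returns are labels invented for this example, and how each one maps onto client calls is left to the caller.

```python
def next_action(status):
    """Map a status dictionary to one of 'build', 'rebuild', or 'ready'."""
    if not status.get("has_index", False):
        return "build"  # nothing built yet: run an ingest session
    if status.get("needs_rebuild", False):
        return "rebuild"  # index exists but is stale
    return "ready"


status = {"index_name": "products", "dim": 384, "has_index": True, "needs_rebuild": False}
print(next_action(status))  # ready
```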

Autocomplete (HTTP Client)

The HTTP client provides Autocomplete methods that communicate with the brinicle server’s Autocomplete API. Autocomplete indexes are optimized for fast prefix matching, which suits search boxes, tag suggestion, and any interface where users type partial queries and expect instant results. Unlike ItemSearch, which combines vector similarity with text matching, Autocomplete focuses on sub-millisecond prefix lookups over string keys.

Each autocomplete index stores a set of string keys and their associated vectors. When a user types a partial query, the engine searches for keys that begin with the typed prefix and returns the closest matches ranked by vector similarity. This lets you build autocomplete experiences that are both fast and semantically relevant. For example, typing “iph” could suggest “iPhone 15 Pro Max” based on both the prefix match and the semantic proximity to other electronics products.
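
To make the prefix-then-rank idea concrete, here is a toy version in plain Python. It is a conceptual sketch only and says nothing about how the brinicle engine is implemented. In particular, the reference vector that candidates are ranked against is an assumption: this page does not say what the engine compares key vectors to.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest(entries, prefix, reference, k=5):
    """Keep keys that start with prefix, then order them by similarity to reference."""
    prefix = prefix.lower()
    matches = [(key, cosine(vec, reference)) for key, vec in entries.items() if key.startswith(prefix)]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return [key for key, _ in matches[:k]]


entries = {
    "iphone 15 pro max": [1.0, 0.0],
    "iphone charger": [0.6, 0.8],
    "samsung galaxy s24 ultra": [0.0, 1.0],
}
print(suggest(entries, "iph", reference=[1.0, 0.0]))
# ['iphone 15 pro max', 'iphone charger']
```

The prefix filter decides which keys are eligible at all; the vectors only affect their order.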

Creating an Autocomplete Index

Use create_autocomplete_index to create a new autocomplete index with an optional autocomplete configuration. The autocomplete_config parameter controls how suggestions are matched and ranked. For example, you can set the minimum prefix length before suggestions are returned, the maximum number of suggestions, and whether to apply fuzzy matching for typo tolerance. If omitted, sensible defaults are used that work well for most use cases.

with BrinicleClient("http://localhost:1984") as client:
    autocomplete_config = {
        "min_prefix_length": 1,
        "max_suggestions": 10,
        "fuzzy_match": True,
        "fuzzy_distance": 2,
    }

    client.create_autocomplete_index(
        "product_suggestions",
        dim=384,
        delta_ratio=0.10,
        params={"M": 48, "ef_construction": 1024, "ef_search": 512},
        autocomplete_config=autocomplete_config,
    )
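
If the index is configured with a min_prefix_length, a type-ahead UI can skip the request entirely while the input is still too short. The guard below is a hypothetical client-side sketch. It assumes the prefix length is counted in characters after trimming whitespace, which may differ from how the server counts.

```python
def should_query(text, min_prefix_length=1):
    """Return True when the trimmed input is long enough to request suggestions."""
    return len(text.strip()) >= min_prefix_length


print(should_query("ip", min_prefix_length=3))   # False
print(should_query("iph", min_prefix_length=3))  # True
```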

Ingesting Autocomplete Entries

Initialize an autocomplete ingest session with init_autocomplete_ingest, then ingest entries using ingest_autocomplete. Each entry consists of a key string and its associated vector. The key is the text that users will search for by prefix, for example a product name. The vector provides the semantic representation used to rank suggestions when multiple prefix matches are found. After all entries are ingested, call finalize to commit the data and build the prefix index.

with BrinicleClient("http://localhost:1984") as client:
    client.init_autocomplete_ingest("product_suggestions", "build")

    client.ingest_autocomplete("product_suggestions", "iphone 15 pro max", [0.1] * 384)
    client.ingest_autocomplete("product_suggestions", "samsung galaxy s24 ultra", [0.2] * 384)
    client.ingest_autocomplete("product_suggestions", "google pixel 8 pro", [0.3] * 384)

    client.finalize("product_suggestions", optimize=True)

Searching Autocomplete Suggestions

Search for autocomplete suggestions by providing a partial query string via search_autocomplete. Results are ranked by a combination of prefix match quality and vector similarity, ensuring that the most relevant suggestions appear first. The method is designed for sub-millisecond response times, making it suitable for interactive type-ahead interfaces.

with BrinicleClient("http://localhost:1984") as client:
    suggestions = client.search_autocomplete("product_suggestions", query="iph", k=5)
    # Illustrative output (depends on the keys you ingested):
    # ['iphone 15 pro max', 'iphone 15 case', 'iphone charger']

Managing Autocomplete Indexes

Manage autocomplete indexes by closing or destroying them, and checking their status. The status check returns the index name, dimension, whether the prefix index is built, and whether a rebuild is needed. This is particularly important for autocomplete indexes because the prefix index must be rebuilt after ingestion to reflect new entries.

with BrinicleClient("http://localhost:1984") as client:
    # Check status
    status = client.get_autocomplete_index_status("product_suggestions")
    # {'index_name': 'product_suggestions', 'dim': 384, 'has_index': True, 'needs_rebuild': False}

    # Close (preserve data)
    client.delete_autocomplete_index("product_suggestions")

    # Permanently destroy
    client.delete_autocomplete_index("product_suggestions", destroy=True)

Error Handling

from brinicle_client import BrinicleClient
from brinicle_client.errors import (
    BrinicleError,
    ConnectionError,  # note: shadows Python's built-in ConnectionError in this module
    ValidationError,
    NotFoundError,
    ConflictError,
)

with BrinicleClient("http://localhost:1984") as client:
    try:
        client.create_index("my_index", dim=128)
    except ValidationError as e:
        print(f"Validation error: {e}")
    except NotFoundError as e:
        print(f"Not found: {e}")
    except ConflictError as e:
        print(f"Conflict: {e}")
    except ConnectionError as e:
        print(f"Connection error: {e}")
    except BrinicleError as e:
        # Generic handler, placed after the specific ones
        print(f"API error ({e.status_code}): {e}")
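
Connection errors are often transient, so a common pattern is to retry with exponential backoff. The helper with_retries below is a hypothetical sketch, not part of brinicle-client. To stay runnable without a server it defaults to Python's built-in ConnectionError; with the real client you would presumably pass retry_on=(brinicle_client.errors.ConnectionError,) instead.

```python
import time


def with_retries(call, attempts=3, base_delay=0.1, retry_on=(ConnectionError,), sleep=time.sleep):
    """Call call() and retry on the given exception types with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: let the last error propagate
            sleep(base_delay * 2 ** attempt)


# Demo: a function that fails twice before succeeding.
state = {"calls": 0}


def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("server not reachable")
    return "ok"


outcome = with_retries(flaky, sleep=lambda seconds: None)  # no real waiting in the demo
print(outcome, state["calls"])  # ok 3
```

Only retry operations that are safe to repeat; whether a given brinicle call is safe to repeat is not covered on this page.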

Choosing Between Core and Client

| Aspect | Core Library (brinicle) | HTTP Client (brinicle-client) |
| --- | --- | --- |
| Architecture | In-process | Client-server |
| Latency | Lowest (no network) | Network overhead |
| Deployment | Embedded in your app | Separate server process |
| Language | Python only | Any language (via HTTP) |
| Concurrency | GIL-limited | Server handles concurrency |
| Use case | Single-app, low-latency | Multi-service, microservices |