Python SDK

brinicle provides two Python packages:

brinicle: the core library with an in-process C++ engine (installed via pip install brinicle)
brinicle-client: an HTTP client for the brinicle server (installed via pip install brinicle-client)

Core Library

The core library provides direct in-process access to the vector engine without needing a separate server.

Installation

pip install brinicle

Usage

import numpy as np
import brinicle

# Vector search
engine = brinicle.VectorEngine("vector_index", dim=384)
engine.init(mode="build")
for i in range(1000):
    engine.ingest(str(i), np.random.randn(384).astype(np.float32))
engine.finalize()
results = engine.search(np.random.randn(384).astype(np.float32), k=10)

# Item search
item_engine = brinicle.ItemSearchEngine("item_index", dim=96)
item_engine.init(mode="build")
item_engine.ingest("p1", title="Apple iPhone 15", category="Electronics")
item_engine.finalize()
results = item_engine.search("iphone", k=5)

# Autocomplete
ac = brinicle.AutocompleteEngine("ac_index", dim=48)
ac.init(mode="build")
ac.ingest("iphone 15 pro max", "iphone 15 pro max")
ac.finalize()
results = ac.search("iph", k=5)

See the VectorEngine, ItemSearchEngine, and AutocompleteEngine documentation for detailed usage.

HTTP Client

The HTTP client communicates with a brinicle server instance and supports both synchronous and asynchronous operations.

Installation

pip install brinicle-client

Synchronous Client

from brinicle_client import BrinicleClient
import numpy as np

with BrinicleClient("http://localhost:1984") as client:
    # Create an index
    client.create_index("my_index", dim=128)

    # Initialize and ingest
    client.init("my_index", "build")
    client.ingest("my_index", "v1", np.random.randn(128).tolist())
    client.ingest("my_index", "v2", np.random.randn(128).tolist())
    client.finalize("my_index", optimize=True)

    # Search
    results = client.search("my_index", np.random.randn(128).astype(np.float32), k=5)
    print(results)  # e.g. ['v1', 'v2']; the order depends on the random vectors

Asynchronous Client

import asyncio

import numpy as np
from brinicle_client import BrinicleClientAsync

async def main():
    async with BrinicleClientAsync("http://localhost:1984") as client:
        await client.create_index("my_index", dim=128)

        await client.init("my_index", "build")
        await client.ingest("my_index", "v1", np.random.randn(128).tolist())
        await client.finalize("my_index", optimize=True)

        results = await client.search(
            "my_index",
            np.random.randn(128).astype(np.float32),
            k=5,
        )
        print(results)

asyncio.run(main())
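
Because the async client's methods are awaited, several searches can be in flight at once. The following is a minimal sketch of bounded concurrency using asyncio.Semaphore. The names search_many and FakeAsyncClient are hypothetical and not part of brinicle-client; the fake stands in for BrinicleClientAsync so the snippet runs without a server, and the only client call assumed is search(index_name, query_vector, k) from the method table.

```python
import asyncio


async def search_many(client, index_name, queries, k=5, max_in_flight=4):
    """Run several searches concurrently, at most max_in_flight at a time.

    `client` is any object with an awaitable search(index_name, query_vector, k)
    method, for example BrinicleClientAsync.
    """
    gate = asyncio.Semaphore(max_in_flight)

    async def one(query):
        async with gate:
            return await client.search(index_name, query, k=k)

    # gather returns results in the same order as `queries`.
    return await asyncio.gather(*(one(q) for q in queries))


class FakeAsyncClient:
    """Hypothetical stand-in so the sketch runs without a server."""

    async def search(self, index_name, query_vector, k=5):
        await asyncio.sleep(0)  # yield control, as a network call would
        return [f"{index_name}:{len(query_vector)}"][:k]


demo_results = asyncio.run(
    search_many(FakeAsyncClient(), "my_index", [[0.0] * 2, [0.0] * 3], k=1)
)
```

With a real client, pass the BrinicleClientAsync instance from inside its async with block. A suitable max_in_flight depends on the server and is best found by measurement.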

API Reference

Both sync and async clients provide the same methods:

Vector Engine Methods

Method	Description
`health_check()`	Check server status
`list_indexes()`	List all indexes
`create_index(index_name, dim, delta_ratio, params)`	Create a new index
`delete_index(index_name, destroy)`	Delete or close an index
`get_index_status(index_name)`	Get index status
`load_index(index_name)`	Load existing index from disk
`init(index_name, mode)`	Initialize ingest session
`ingest(index_name, external_id, vector)`	Ingest a single vector
`ingest_batch(index_name, ids, vectors, dim)`	Batch ingest (binary)
`finalize(index_name, optimize, params)`	Finalize ingest session
`delete_items(index_name, external_ids, return_not_found)`	Delete items
`rebuild(index_name, params)`	Rebuild the index in compact form
`search(index_name, query_vector, k, efs)`	Search for nearest neighbors
`optimize(index_name)`	Optimize HNSW graph
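
For larger datasets, ingest_batch avoids one request per vector. The sketch below splits IDs and vectors into fixed-size chunks before calling it. The table lists only parameter names, so the argument types here (a list of IDs and a parallel list of vectors) are an assumption, and ingest_in_batches and RecordingClient are hypothetical helpers rather than part of brinicle-client.

```python
def ingest_in_batches(client, index_name, ids, vectors, dim, batch_size=1000):
    """Send (id, vector) pairs to client.ingest_batch in fixed-size chunks.

    Returns the number of batches sent. Assumes ingest_batch accepts a list
    of IDs and a parallel list of vectors; check the client for exact types.
    """
    if len(ids) != len(vectors):
        raise ValueError("ids and vectors must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for vec in vectors:
        if len(vec) != dim:
            raise ValueError(f"expected vectors of length {dim}, got {len(vec)}")

    sent = 0
    for start in range(0, len(ids), batch_size):
        end = start + batch_size
        client.ingest_batch(index_name, ids[start:end], vectors[start:end], dim)
        sent += 1
    return sent


class RecordingClient:
    """Hypothetical stand-in that records calls instead of contacting a server."""

    def __init__(self):
        self.calls = []

    def ingest_batch(self, index_name, ids, vectors, dim):
        self.calls.append((index_name, list(ids), dim))


recorder = RecordingClient()
batches = ingest_in_batches(
    recorder, "my_index", ["a", "b", "c"], [[0.0, 1.0]] * 3, dim=2, batch_size=2
)
```

With a real client, the call would sit between init and finalize, like the single-vector ingest calls shown earlier.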

ItemSearch Methods

Method	Description
`create_item_index(index_name, dim, delta_ratio, params, lexical_config)`	Create an item search index
`init_item_ingest(index_name, mode)`	Initialize item ingest session
`ingest_item(index_name, external_id, vector, fields)`	Ingest a single item with metadata
`search_items(index_name, query, filters, k, efs)`	Search items with text query and filters
`delete_item_index(index_name, destroy)`	Delete or close an item index
`get_item_index_status(index_name)`	Get item index status

Autocomplete Methods

Method	Description
`create_autocomplete_index(index_name, dim, delta_ratio, params, autocomplete_config)`	Create an autocomplete index
`init_autocomplete_ingest(index_name, mode)`	Initialize autocomplete ingest session
`ingest_autocomplete(index_name, key, vector)`	Ingest a single autocomplete entry
`search_autocomplete(index_name, query, k)`	Search for autocomplete suggestions
`delete_autocomplete_index(index_name, destroy)`	Delete or close an autocomplete index
`get_autocomplete_index_status(index_name)`	Get autocomplete index status

ItemSearch (HTTP Client)

The HTTP client provides ItemSearch methods that communicate with the brinicle server’s ItemSearch API. ItemSearch enables structured search over items with lexical fields, combining vector similarity with metadata filtering. It suits product catalogs, document collections, and any dataset that must be searched by both content and structured attributes.

Unlike basic vector search, which operates on raw embeddings, ItemSearch lets you ingest items with human-readable fields such as titles, descriptions, and categories, then run filtered searches that combine full-text matching with vector similarity. Each item index is backed by its own vector index, so similarity lookups still use brinicle’s HNSW graph. On top of that, ItemSearch adds a lexical configuration layer that controls how text fields are tokenized and matched, so you can tune relevance for your domain.

Creating an Item Index

Use create_item_index to create a new item search index with an optional lexical configuration. The lexical_config parameter controls how text fields are analyzed during ingestion and search — you can specify which fields are searchable, how they are tokenized, and whether stemming or stop-word removal should be applied. If omitted, a default configuration is used that treats all text fields as searchable with standard tokenization.

from brinicle_client import BrinicleClient, BrinicleClientAsync

# Synchronous
with BrinicleClient("http://localhost:1984") as client:
    lexical_config = {
        "searchable_fields": ["title", "description", "category"],
        "tokenizer": "standard",
        "lowercase": True,
        "stem": False,
        "remove_stop_words": True,
    }

    client.create_item_index(
        "products",
        dim=384,
        delta_ratio=0.10,
        params={"M": 48, "ef_construction": 1024, "ef_search": 512},
        lexical_config=lexical_config,
    )

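# Asynchronous alternative (async with is only valid inside a coroutine);
# reuses the lexical_config defined above
import asyncio

async def main():
    async with BrinicleClientAsync("http://localhost:1984") as client:
        await client.create_item_index(
            "products",
            dim=384,
            delta_ratio=0.10,
            params={"M": 48, "ef_construction": 1024, "ef_search": 512},
            lexical_config=lexical_config,
        )

asyncio.run(main())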
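
To make the lexical options more concrete, the pure-Python sketch below mimics what lowercasing, word tokenization, and stop-word removal might do to a field value. It illustrates the concepts only and is not brinicle's analyzer: the regex tokenizer, the stop-word list, and the analyze function are all assumptions made for this example.

```python
import re

# A tiny illustrative stop-word list; the server's actual list is not documented here.
STOP_WORDS = {"a", "an", "and", "the", "with", "of"}


def analyze(text, lowercase=True, remove_stop_words=True):
    """Split text into word tokens, optionally lowercasing and dropping stop words."""
    if lowercase:
        text = text.lower()
    tokens = re.findall(r"\w+", text)
    if remove_stop_words:
        tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
    return tokens


tokens = analyze("Latest iPhone with A17 Pro chip")
```

Here tokens holds the lowercased words of the description without "with", which is the kind of normalized form a query such as "iphone" would be matched against.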

Ingesting Items

Initialize an item ingest session with init_item_ingest, then ingest items with their vectors and metadata fields using ingest_item. Each item consists of an external ID, a vector matching the index dimension, and a dictionary of string fields. Fields listed in the searchable_fields of your lexical_config will be indexed for full-text search, while other fields are stored but not searchable. After all items are ingested, call finalize to build the index.

with BrinicleClient("http://localhost:1984") as client:
    client.init_item_ingest("products", "build")

    client.ingest_item(
        "products",
        external_id="p1",
        vector=[0.1] * 384,
        fields={
            "title": "Apple iPhone 15 Pro Max",
            "description": "Latest iPhone with A17 Pro chip",
            "category": "Electronics",
            "price": "1199",
        },
    )

    client.ingest_item(
        "products",
        external_id="p2",
        vector=[0.2] * 384,
        fields={
            "title": "Samsung Galaxy S24 Ultra",
            "description": "Flagship Android with AI features",
            "category": "Electronics",
            "price": "1299",
        },
    )

    client.finalize("products", optimize=True)
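
Since fields are a dictionary of strings (note that the price above is passed as "1199"), records loaded from a database or JSON file may need converting first. A minimal sketch follows, with to_string_fields as a hypothetical helper; dropping None values is a choice made for this example, not a documented requirement.

```python
def to_string_fields(record):
    """Convert a record's values to strings, skipping None values."""
    return {key: str(value) for key, value in record.items() if value is not None}


fields = to_string_fields(
    {"title": "Apple iPhone 15 Pro Max", "price": 1199, "discount": None}
)
```

The resulting dictionary can be passed as the fields argument of ingest_item.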

Searching Items

Search for items using a text query with optional structured filters via search_items. The query parameter is the text query, which is matched against the searchable fields defined in the lexical configuration. The filters parameter accepts structured conditions under must and must_not keys that narrow results by exact field values or ranges. Use k to set the number of results and efs to set the search beam width, which trades recall against latency.

with BrinicleClient("http://localhost:1984") as client:
    # Simple text search
    results = client.search_items("products", query="iphone", k=10)
    # {'results': [...], 'total': 5}

    # Search with filters
    results = client.search_items(
        "products",
        query="smartphone",
        filters={
            "must": {"category": "Electronics"},
            "must_not": {"price": ["0", "100"]},
        },
        k=10,
        efs=64,
    )

    for result in results["results"]:
        print(f"{result['external_id']} (score: {result['score']})")
        print(f"Fields: {result['fields']}")
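
One plausible reading of the filter structure can be written as a small local predicate, which is handy for checking a filter against sample data before sending it. The server's actual semantics may differ. In particular, treating a two-element list as an inclusive numeric range is an assumption drawn from the example above, and matches and _condition_holds are hypothetical names.

```python
def _condition_holds(value, condition):
    """Check one field value against an exact value or a [low, high] range."""
    if isinstance(condition, (list, tuple)) and len(condition) == 2:
        try:
            low, high = float(condition[0]), float(condition[1])
            return low <= float(value) <= high
        except (TypeError, ValueError):
            return False  # missing or non-numeric values never fall in a range
    return value == condition


def matches(fields, filters):
    """Return True if fields satisfy every must and no must_not condition."""
    must = filters.get("must", {})
    must_not = filters.get("must_not", {})
    if not all(_condition_holds(fields.get(k), c) for k, c in must.items()):
        return False
    return not any(_condition_holds(fields.get(k), c) for k, c in must_not.items())


filters = {"must": {"category": "Electronics"}, "must_not": {"price": ["0", "100"]}}
phone = {"category": "Electronics", "price": "1199"}
cable = {"category": "Electronics", "price": "15"}
```

Under this reading, phone passes the filter while cable is excluded because its price falls inside the must_not range.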

Managing Item Indexes

You can close or permanently destroy item indexes, and check their status at any time. Closing an index preserves its data on disk for later reloading, while destroying it permanently removes all data. The status check returns the index name, dimension, whether an index is present (has_index), and whether a rebuild is needed after recent ingestions (needs_rebuild).

with BrinicleClient("http://localhost:1984") as client:
    # Check status
    status = client.get_item_index_status("products")
    # {'index_name': 'products', 'dim': 384, 'has_index': True, 'needs_rebuild': False}

    # Close (preserve data)
    client.delete_item_index("products")

    # Permanently destroy
    client.delete_item_index("products", destroy=True)
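
The status dictionary can be turned into a simple pre-search check. The sketch below relies only on the has_index and needs_rebuild keys shown in the example status above; index_problems is a hypothetical helper, and the wording of its messages is illustrative.

```python
def index_problems(status):
    """Return human-readable reasons why an index may not be ready to search."""
    problems = []
    if not status.get("has_index", False):
        problems.append("no index has been built yet")
    if status.get("needs_rebuild", False):
        problems.append("index needs a rebuild")
    return problems


ready_status = {
    "index_name": "products",
    "dim": 384,
    "has_index": True,
    "needs_rebuild": False,
}
stale_status = {"index_name": "products", "dim": 384, "has_index": False, "needs_rebuild": True}
```

An empty list means the status reports nothing that should block a search. Otherwise the messages can be logged or shown to an operator.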

Autocomplete (HTTP Client)

The HTTP client provides Autocomplete methods that communicate with the brinicle server’s Autocomplete API. Autocomplete indexes are optimized for fast prefix matching, which suits search boxes, tag suggestion, and any interface where users type partial queries and expect instant results. Unlike ItemSearch, which combines vector similarity with text matching, Autocomplete focuses on sub-millisecond prefix lookups over string keys.

Each autocomplete index stores a set of string keys and their associated vectors. When a user types a partial query, the engine searches for keys that begin with the typed prefix and returns the closest matches ranked by vector similarity. This lets you build autocomplete experiences that are both fast and semantically relevant. For example, typing “iph” could suggest “iPhone 15 Pro Max” based on both the prefix match and the semantic proximity to other electronics products.

Creating an Autocomplete Index

Use create_autocomplete_index to create a new autocomplete index with an optional autocomplete configuration. The autocomplete_config parameter controls how suggestions are matched and ranked — for example, you can set the minimum prefix length before suggestions are returned, the maximum number of suggestions, and whether to apply fuzzy matching for typo tolerance. If omitted, sensible defaults are used that work well for most use cases.

with BrinicleClient("http://localhost:1984") as client:
    autocomplete_config = {
        "min_prefix_length": 1,
        "max_suggestions": 10,
        "fuzzy_match": True,
        "fuzzy_distance": 2,
    }

    client.create_autocomplete_index(
        "product_suggestions",
        dim=384,
        delta_ratio=0.10,
        params={"M": 48, "ef_construction": 1024, "ef_search": 512},
        autocomplete_config=autocomplete_config,
    )

Ingesting Autocomplete Entries

Initialize an autocomplete ingest session with init_autocomplete_ingest, then ingest entries using ingest_autocomplete. Each entry consists of a key string and its associated vector. The key is the text that users will search for by prefix — for example, a product name. The vector provides the semantic representation used to rank suggestions when multiple prefix matches are found. After all entries are ingested, call finalize to commit the data and build the prefix index.

with BrinicleClient("http://localhost:1984") as client:
    client.init_autocomplete_ingest("product_suggestions", "build")

    client.ingest_autocomplete("product_suggestions", "iphone 15 pro max", [0.1] * 384)
    client.ingest_autocomplete("product_suggestions", "samsung galaxy s24 ultra", [0.2] * 384)
    client.ingest_autocomplete("product_suggestions", "google pixel 8 pro", [0.3] * 384)

    client.finalize("product_suggestions", optimize=True)

Searching Autocomplete Suggestions

Search for autocomplete suggestions by providing a partial query string via search_autocomplete. Results are ranked by a combination of prefix match quality and vector similarity, ensuring that the most relevant suggestions appear first. The method is designed for sub-millisecond response times, making it suitable for interactive type-ahead interfaces.

with BrinicleClient("http://localhost:1984") as client:
    suggestions = client.search_autocomplete("product_suggestions", query="iph", k=5)
    # e.g. ['iphone 15 pro max'] for the three entries ingested above
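
In a type-ahead interface it is common to skip the round trip when the input is too short to be useful. A minimal sketch of such a guard follows. The names suggest and FakeSuggestClient are hypothetical; lowercasing the input assumes keys were ingested in lowercase, as in the examples above; and the fake client does plain prefix matching only so the snippet runs without a server.

```python
def suggest(client, index_name, typed, k=5, min_chars=2):
    """Return suggestions for the typed text, skipping very short input."""
    query = typed.strip().lower()
    if len(query) < min_chars:
        return []  # avoid a request for input too short to be useful
    return client.search_autocomplete(index_name, query=query, k=k)


class FakeSuggestClient:
    """Hypothetical stand-in: plain prefix match over a fixed key list."""

    def __init__(self, keys):
        self.keys = keys

    def search_autocomplete(self, index_name, query, k):
        return [key for key in self.keys if key.startswith(query)][:k]


fake = FakeSuggestClient(["iphone 15 pro max", "google pixel 8 pro"])
short = suggest(fake, "product_suggestions", "i")
hits = suggest(fake, "product_suggestions", " Iph ")
```

With a real BrinicleClient in place of the fake, the same function can back a search box handler.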

Managing Autocomplete Indexes

Manage autocomplete indexes by closing or destroying them, and checking their status. The status check returns the index name, dimension, whether the prefix index is built, and whether a rebuild is needed. This is particularly important for autocomplete indexes because the prefix index must be rebuilt after ingestion to reflect new entries.

with BrinicleClient("http://localhost:1984") as client:
    # Check status
    status = client.get_autocomplete_index_status("product_suggestions")
    # {'index_name': 'product_suggestions', 'dim': 384, 'has_index': True, 'needs_rebuild': False}

    # Close (preserve data)
    client.delete_autocomplete_index("product_suggestions")

    # Permanently destroy
    client.delete_autocomplete_index("product_suggestions", destroy=True)

Error Handling

from brinicle_client import BrinicleClient
from brinicle_client.errors import (
    BrinicleError,
    ConnectionError,  # note: this name shadows Python's built-in ConnectionError
    ValidationError,
    NotFoundError,
    ConflictError,
)

with BrinicleClient("http://localhost:1984") as client:
    try:
        client.create_index("my_index", dim=128)
    except ValidationError as e:
        print(f"Validation error: {e}")
    except NotFoundError as e:
        print(f"Not found: {e}")
    except ConflictError as e:
        print(f"Conflict: {e}")
    except ConnectionError as e:
        print(f"Connection error: {e}")
    except BrinicleError as e:
        # Any other API error
        print(f"API error ({e.status_code}): {e}")

Choosing Between Core and Client

Aspect	Core Library (`brinicle`)	HTTP Client (`brinicle-client`)
Architecture	In-process	Client-server
Latency	Lowest (no network)	Network overhead
Deployment	Embedded in your app	Separate server process
Language	Python only	Any language (via HTTP)
Concurrency	GIL-limited	Server handles concurrency
Use case	Single-app, low-latency	Multi-service, microservices
