Configuration

brinicle exposes configuration options for tuning the HNSW algorithm and the lexical encoding system. Understanding these parameters helps you optimize for your use case: maximum recall, minimum latency, or a balance of the two.

HNSW Parameters

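The Hierarchical Navigable Small World (HNSW) algorithm has three primary parameters (M, ef_construction, and ef_search) that control the trade-off between index quality, build time, and search performance. Two further settings, delta_ratio and seed, affect ingestion behavior and build reproducibility.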

M (Connectivity)

The M parameter controls the number of bi-directional links per node in the HNSW graph. Higher values create a more connected graph, which generally improves recall at the cost of increased memory usage and build time.

M Value	Memory	Build Speed	Recall	Use Case
16	Low	Fast	Good	General purpose, resource-constrained
32	Medium	Medium	Very Good	Balanced production workloads
48	High	Slow	Excellent	High-recall requirements
64+	Very High	Very Slow	Near-Perfect	Maximum accuracy, ample resources

Default: 16
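
To get a feel for how M drives memory, the following is a rough sketch that estimates the size of the neighbor lists alone. It assumes the conventional HNSW layout (2*M link slots on layer 0, M slots on each upper layer, 4-byte links), which brinicle may or may not follow exactly; estimate_link_memory_bytes is a hypothetical helper, not part of the brinicle API.

```python
def estimate_link_memory_bytes(num_vectors: int, m: int, bytes_per_link: int = 4) -> float:
    """Rough size of the neighbor lists only (the vectors themselves are excluded).

    Assumption: 2*m link slots on layer 0 and m slots on each upper layer,
    with a node reaching layer k with probability m**-k.
    """
    if num_vectors < 0 or m < 2:
        raise ValueError("num_vectors must be >= 0 and m must be >= 2")
    base_slots = 2 * m                     # every node lives on layer 0
    expected_upper_layers = 1.0 / (m - 1)  # geometric series: sum of m**-k for k >= 1
    upper_slots = m * expected_upper_layers
    return num_vectors * (base_slots + upper_slots) * bytes_per_link


if __name__ == "__main__":
    for m in (16, 32, 48, 64):
        mib = estimate_link_memory_bytes(1_000_000, m) / 2**20
        print(f"M={m:>2}: ~{mib:,.0f} MiB of links per million vectors")
```

Under these assumptions link memory grows roughly linearly with M, which is why doubling M is most noticeable on indexes with many small vectors.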

ef_construction (Build-Time Search Width)

Controls the size of the dynamic candidate list during index construction. Higher values produce better-quality graphs but slow down the build process.

Value	Build Time	Graph Quality	Use Case
100-200	Fast	Good	Quick prototyping, small datasets
200-512	Medium	Very Good	Production workloads
512-1024	Slow	Excellent	High-recall requirements
1024+	Very Slow	Near-Perfect	Maximum quality builds

Default: 200

ef_search (Query-Time Search Width)

Controls the size of the dynamic candidate list during search queries. This is the primary knob for tuning the recall-speed trade-off at query time.

Value	Query Speed	Recall	Use Case
32-64	Fast	Good	Autocomplete, exploratory search
64-128	Medium	Very Good	Most production workloads
128-512	Slow	Excellent	Accuracy-critical applications
512+	Very Slow	Near-Perfect	Maximum recall

Default: 64

You can override ef_search per query using the efs parameter in search calls, which lets you balance speed and accuracy for different use cases within the same index.
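
One way to choose a value is to sweep a few candidates against queries with known answers and keep the smallest one that meets a recall target. The sketch below assumes you can wrap your search call in a function taking (query, efs) and returning result ids; that wrapper, fake_search, and smallest_efs_for_target are illustrative names, not brinicle APIs.

```python
from typing import Callable, Optional, Sequence

# Assumed shape of a search wrapper: (query_vector, efs) -> ids of the results.
SearchFn = Callable[[Sequence[float], int], Sequence[int]]


def recall(found: Sequence[int], expected: Sequence[int]) -> float:
    """Fraction of the expected ids that appear in the results."""
    if not expected:
        return 1.0
    return len(set(found) & set(expected)) / len(expected)


def smallest_efs_for_target(
    search: SearchFn,
    queries: Sequence[Sequence[float]],
    ground_truth: Sequence[Sequence[int]],
    candidates: Sequence[int] = (32, 64, 128, 256, 512),
    target_recall: float = 0.95,
) -> Optional[int]:
    """Return the smallest candidate whose mean recall meets the target, else None."""
    if not queries or len(queries) != len(ground_truth):
        raise ValueError("queries and ground_truth must be non-empty and the same length")
    for efs in sorted(candidates):
        mean = sum(recall(search(q, efs), t) for q, t in zip(queries, ground_truth)) / len(queries)
        if mean >= target_recall:
            return efs
    return None


def fake_search(query: Sequence[float], efs: int) -> Sequence[int]:
    # Stand-in for a real index: wider searches recover more of the true neighbors.
    true_ids = list(range(10))
    return true_ids[: min(10, efs // 16)]


if __name__ == "__main__":
    best = smallest_efs_for_target(fake_search, [[0.0]], [list(range(10))], target_recall=0.75)
    print("smallest efs meeting the target:", best)
```

In practice the ground truth would come from an exact (brute-force) search over a sample of real queries.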

delta_ratio

Controls the size of the delta segment relative to the main segment. The delta segment holds recently ingested vectors before they are merged into the main graph.

Value	Ingest Performance	Search Performance	Use Case
0.05	Slower merges	Best	Read-heavy workloads
0.10	Balanced	Good	General purpose
0.20-0.50	Fastest ingest	Degraded	Write-heavy workloads

Default: 0.10
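
As a mental model only, the sketch below treats delta_ratio as a threshold: the delta segment may grow to delta_ratio times the size of the main segment before a merge is due. This is one plausible reading of the description above, not a statement of brinicle's actual merge policy; delta_capacity and should_merge are hypothetical helpers.

```python
def delta_capacity(main_count: int, delta_ratio: float = 0.10) -> int:
    """Largest delta segment allowed before a merge, under the assumed threshold rule."""
    if main_count < 0 or delta_ratio <= 0.0:
        raise ValueError("main_count must be >= 0 and delta_ratio must be > 0")
    # Always allow at least one pending vector so an empty index can accept writes.
    return max(1, int(main_count * delta_ratio))


def should_merge(main_count: int, delta_count: int, delta_ratio: float = 0.10) -> bool:
    """True once the delta segment has reached its assumed capacity."""
    return delta_count >= delta_capacity(main_count, delta_ratio)


if __name__ == "__main__":
    for ratio in (0.05, 0.10, 0.20, 0.50):
        print(f"delta_ratio={ratio:.2f}: merge after ~{delta_capacity(1_000_000, ratio):,} pending vectors")
```

Read this way, a larger ratio means fewer merges during ingest but more vectors sitting outside the main graph at query time, which matches the trade-off in the table.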

seed

The random number generator seed used during graph construction. Setting a fixed seed ensures reproducible builds, which can be useful for testing and benchmarking.

Default: 0
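
To illustrate why a seed matters, the sketch below reproduces the randomized step of textbook HNSW construction: each node draws its top layer from an exponential distribution. It uses Python's random module and the level formula from the original HNSW description as assumptions; brinicle's internal generator and details may differ.

```python
import math
import random


def assign_levels(num_nodes: int, m: int, seed: int) -> list[int]:
    """Draw a top layer for each node, as in textbook HNSW construction."""
    rng = random.Random(seed)        # private generator, so the seed fully determines the output
    level_scale = 1.0 / math.log(m)  # conventional normalization factor mL = 1 / ln(M)
    levels = []
    for _ in range(num_nodes):
        u = 1.0 - rng.random()       # in (0, 1], so log(u) is always defined
        levels.append(int(-math.log(u) * level_scale))
    return levels


if __name__ == "__main__":
    first = assign_levels(1000, m=16, seed=0)
    second = assign_levels(1000, m=16, seed=0)
    print("same seed, same levels:", first == second)
    print("nodes above layer 0:", sum(1 for level in first if level > 0))
```

Because the layer assignment shapes the whole graph, two builds with different seeds can return slightly different results for the same query, which is worth controlling for in benchmarks.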

Lexical Scoring Configuration

LexicalConfig controls how the ItemSearchEngine scores and encodes structured items. These weights determine the relative importance of each field during both indexing and search.

Build-Time Weights

Build-time weights control how much each field contributes to the item vector during indexing:

import brinicle

cfg = brinicle.LexicalConfig()
cfg.build_title_weight = 0.70       # Title importance during build
cfg.build_attr_weight = 0.15        # Attribute importance during build
cfg.build_subcategory_weight = 0.10 # Subcategory importance during build
cfg.build_category_weight = 0.05    # Category importance during build
cfg.build_category_penalty = 0.20   # Penalty for category mismatch

Search-Time Weights

Search-time weights control how much each field contributes to the query encoding and distance calculation:

cfg.search_title_weight = 0.60       # Title importance during search
cfg.search_attr_weight = 0.20        # Attribute importance during search
cfg.search_subcategory_weight = 0.10 # Subcategory importance during search
cfg.search_category_weight = 0.10    # Category importance during search
cfg.search_category_penalty = 0.30   # Penalty for category mismatch
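
The field weights in each group above sum to 1.0, so they can be read as shares of a combined score. The sketch below shows one way such weights could combine per-field similarities, with the category penalty subtracted on a mismatch. The formula is an assumption for illustration only; FieldWeights and combined_score are hypothetical and do not describe how ItemSearchEngine computes distances.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldWeights:
    title: float
    attr: float
    subcategory: float
    category: float
    category_penalty: float

    def field_sum(self) -> float:
        """Sum of the four field weights (the penalty is not a share)."""
        return self.title + self.attr + self.subcategory + self.category


# Values copied from the build-time and search-time settings shown above.
BUILD = FieldWeights(0.70, 0.15, 0.10, 0.05, 0.20)
SEARCH = FieldWeights(0.60, 0.20, 0.10, 0.10, 0.30)


def combined_score(similarities: dict[str, float], weights: FieldWeights, category_matches: bool) -> float:
    """Weighted sum of per-field similarities in [0, 1]; assumed formula, for illustration."""
    score = (
        weights.title * similarities["title"]
        + weights.attr * similarities["attr"]
        + weights.subcategory * similarities["subcategory"]
        + weights.category * similarities["category"]
    )
    if not category_matches:
        score -= weights.category_penalty  # assumed: a flat deduction on mismatch
    return score


if __name__ == "__main__":
    sims = {"title": 0.9, "attr": 0.5, "subcategory": 1.0, "category": 1.0}
    print("weights sum to 1:", math.isclose(SEARCH.field_sum(), 1.0))
    print("matching category:", round(combined_score(sims, SEARCH, True), 3))
    print("mismatched category:", round(combined_score(sims, SEARCH, False), 3))
```

If you change one weight, checking that the group still sums to 1.0 keeps scores comparable across configurations.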

Title Alpha and Beta

These parameters control how title tokens are scored during both build and search:

cfg.build_title_alpha = 0.5   # Build title alpha
cfg.build_title_beta = 0.5    # Build title beta
cfg.search_title_alpha = 0.5  # Search title alpha
cfg.search_title_beta = 0.5   # Search title beta

Autocomplete Scoring Configuration

AutocompleteConfig controls how the AutocompleteEngine scores suggestions:

import brinicle

cfg = brinicle.AutocompleteConfig()
cfg.build_position_decay = 0.5   # Token position decay during build
cfg.build_length_penalty = 0.2   # Length penalty during build
cfg.search_position_decay = 0.5  # Token position decay during search
cfg.search_length_penalty = 0.2  # Length penalty during search

Position Decay

Higher position decay values give more weight to tokens that appear earlier in the suggestion. This is useful for prefix-heavy autocomplete where users typically type from the beginning of a phrase.

Length Penalty

Higher length penalty values penalize longer suggestions, so shorter, more concise completions rank higher.
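
The two effects described above can be seen in a toy scorer. The formulas here (a 1 / (1 + decay * position) token weight and a 1 + penalty * length divisor) are assumptions chosen for illustration, and suggestion_score is a hypothetical function; the AutocompleteEngine's real scoring is not documented in this section.

```python
def token_weights(num_tokens: int, position_decay: float) -> list[float]:
    """Weight per token position; earlier tokens get larger weights as decay grows."""
    return [1.0 / (1.0 + position_decay * i) for i in range(num_tokens)]


def suggestion_score(
    suggestion: str,
    query_tokens: list[str],
    position_decay: float = 0.5,
    length_penalty: float = 0.2,
) -> float:
    """Sum the weights of matched tokens, then discount by suggestion length."""
    tokens = suggestion.lower().split()
    wanted = {t.lower() for t in query_tokens}
    weights = token_weights(len(tokens), position_decay)
    matched = sum(w for tok, w in zip(tokens, weights) if tok in wanted)
    return matched / (1.0 + length_penalty * len(tokens))


if __name__ == "__main__":
    for text in ("red shoes", "shoes red", "red shoes for running"):
        print(f"{text!r}: {suggestion_score(text, ['red']):.3f}")
```

With these formulas a match in the first token outranks the same match later in the phrase, and a two-word suggestion outranks a four-word one containing the same match.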

Index Path Configuration

When you create an engine, the index_path parameter determines where the index files are stored on disk. For an index at path "my_index", the following files are created:

my_index.main     # Main HNSW graph
my_index.delta    # Delta/pending segment
my_index.lock     # Lock file
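
For backups or cleanup scripts it can help to derive these file names from the index path. The sketch below does so with pathlib, based only on the three suffixes listed above; index_files and existing_index_files are hypothetical helpers, and whether every file exists at a given moment (the lock file in particular) is not something this section specifies.

```python
from pathlib import Path

# Suffixes taken from the file listing above.
INDEX_SUFFIXES = (".main", ".delta", ".lock")


def index_files(index_path: str) -> list[Path]:
    """Paths of the files associated with an index at index_path."""
    # Append rather than use Path.with_suffix, so a path such as "my.index"
    # keeps its full name instead of losing ".index".
    return [Path(index_path + suffix) for suffix in INDEX_SUFFIXES]


def existing_index_files(index_path: str) -> list[Path]:
    """Subset of index_files() that is currently present on disk."""
    return [p for p in index_files(index_path) if p.exists()]


if __name__ == "__main__":
    for path in index_files("my_index"):
        status = "present" if path.exists() else "absent"
        print(f"{path}: {status}")
```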

For the HTTP server, the default data directory is /app/data/. To store indexes elsewhere, modify the store_dir variable in the server code, or, when running in Docker, mount a host directory as a volume at /app/data.

HTTP Server Configuration

The FastAPI server can be configured through environment variables and command-line options:

Setting	Default	Description
Host	0.0.0.0	Server bind address
Port	1984	Server bind port
Data directory	/app/data/	Index storage directory
Memory limit	1GB (Docker)	Container memory limit

Docker Configuration

The docker-compose.yml file provides the default configuration:

services:
  brinicle:
    build: .
    ports:
      - "1984:1984"
    mem_limit: 1g
    volumes:
      - ./data:/app/data
