Insert and Upsert

After the initial build, you can add new vectors to an existing index using insert mode, or update existing vectors using upsert mode. These operations keep your index up to date without rebuilding it from scratch.

Insert Mode

Use insert mode to add new vectors to an existing index.

import numpy as np
import brinicle

# Load existing index
engine = brinicle.VectorEngine("vector_index", dim=384)

# Start an insert session
engine.init(mode="insert")

# Add new vectors
new_vectors = np.random.randn(100, 384).astype(np.float32)
for i, vec in enumerate(new_vectors):
    engine.ingest(f"new_{i}", vec)

# Finalize the insert
engine.finalize()

# Search now includes both old and new vectors
query = np.random.randn(384).astype(np.float32)
results = engine.search(query, k=10)

Inserted records are added through the delta index. This allows Brinicle to accept updates without rebuilding the full main index after every insert.
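
When records come from an external source, it can help to validate the whole batch before an insert session is opened, so that a session is never started with input that will fail partway through. The sketch below is one way to do that. The helper name insert_batch is hypothetical, and it assumes only that the engine exposes init(mode=...), ingest(id, vector) and finalize() as used in the example above.

```python
def insert_batch(engine, records, dim):
    """Validate (id, vector) pairs, then add them in one insert session.

    `engine` is assumed to expose init(mode=...), ingest(id, vector) and
    finalize(), as in the insert example. This helper is illustrative and
    is not part of the library.
    """
    records = list(records)

    # Check everything up front so no session is opened for a bad batch.
    seen = set()
    for record_id, vector in records:
        if len(vector) != dim:
            raise ValueError(
                f"{record_id!r}: expected {dim} dimensions, got {len(vector)}"
            )
        if record_id in seen:
            raise ValueError(f"duplicate id in batch: {record_id!r}")
        seen.add(record_id)

    if not records:
        return 0  # nothing to do, so do not open a session

    engine.init(mode="insert")
    for record_id, vector in records:
        engine.ingest(record_id, vector)
    engine.finalize()
    return len(records)
```

Because the check uses only len(vector), the same helper works with NumPy arrays and plain lists.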

When to Use Insert

Use insert mode when you want to add entirely new vectors to an existing index. Common scenarios include:
  • Adding new documents or items as they become available
  • Periodic batch updates from a data pipeline
  • Incremental indexing of streaming data
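
For the batch and streaming scenarios, one possible pattern is to cut the incoming records into fixed-size chunks and run one insert session per chunk. The following is a minimal sketch under that assumption. The names chunks and insert_stream are hypothetical, the batch size is arbitrary, and the sketch assumes that insert sessions can be run back to back on the same engine object.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def insert_stream(engine, records, batch_size=1000):
    """Insert (id, vector) pairs in batches, one insert session per batch.

    Illustrative helper, not part of the library. `engine` is assumed to
    expose init(mode=...), ingest(id, vector) and finalize().
    """
    total = 0
    for chunk in chunks(records, batch_size):
        engine.init(mode="insert")
        for record_id, vector in chunk:
            engine.ingest(record_id, vector)
        engine.finalize()  # records in this chunk become searchable here
        total += len(chunk)
    return total
```

Smaller batches make new records searchable sooner, while larger batches mean fewer insert cycles. The right trade-off depends on the workload.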

Upsert Mode

Use upsert mode to replace existing records or insert new records.

import numpy as np
import brinicle

engine = brinicle.VectorEngine("vector_index", dim=384)

# Start an upsert session
engine.init(mode="upsert")

# Update existing vector or add new one
updated_vector = np.random.randn(384).astype(np.float32)
engine.ingest("existing_id", updated_vector)

# Add a completely new vector
new_vector = np.random.randn(384).astype(np.float32)
engine.ingest("brand_new_id", new_vector)

engine.finalize()

If "existing_id" already exists, Brinicle marks the old record as deleted and inserts the new version. If "existing_id" does not exist, it is inserted as a new record.
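
The "mark as deleted, then insert" behavior can be pictured with a small in-memory model. The class below is a conceptual toy written for this explanation only. It is not Brinicle's implementation, and the names ToyUpsertStore, slots and live are invented for illustration.

```python
class ToyUpsertStore:
    """Conceptual model of upsert: tombstone the old record, append the new one."""

    def __init__(self):
        self.slots = []  # append-only list of [record_id, vector, deleted]
        self.live = {}   # record_id -> position of its live slot

    def upsert(self, record_id, vector):
        """Store a record; return True if it replaced an existing one."""
        old_position = self.live.get(record_id)
        if old_position is not None:
            # The old version is only flagged; it still occupies a slot.
            self.slots[old_position][2] = True
        self.slots.append([record_id, vector, False])
        self.live[record_id] = len(self.slots) - 1
        return old_position is not None

    def live_records(self):
        """Return (record_id, vector) for every record not marked deleted."""
        return [(rid, vec) for rid, vec, deleted in self.slots if not deleted]


store = ToyUpsertStore()
store.upsert("existing_id", [0.1, 0.2])   # new record
store.upsert("existing_id", [0.3, 0.4])   # replaces the first version
print(len(store.slots), len(store.live_records()))  # prints: 2 1
```

In this toy model, the replaced version keeps taking up space until something compacts the store, which is the same reason a periodic rebuild is useful after many upserts.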

When to Use Upsert

Use upsert mode when you want to update existing vectors or add new ones in a single operation. Common scenarios include:
  • Refreshing embeddings for existing items (e.g., after re-encoding with a new model)
  • Syncing an index with a source of truth where items may be updated
  • Correcting errors in previously ingested vectors
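
When syncing with a source of truth, re-embedding and upserting every item on every run is often unnecessary. One common approach, sketched below, is to keep a content fingerprint per item and upsert only the items whose fingerprint is new or has changed. The functions fingerprint and plan_upserts are hypothetical helpers. Storing fingerprints alongside the index is an assumption of this sketch, not a Brinicle feature.

```python
import hashlib


def fingerprint(text):
    """Return a stable hash of an item's source content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_upserts(source, indexed_fingerprints):
    """Return the ids whose content is new or changed since the last sync.

    `source` maps id -> current content. `indexed_fingerprints` maps
    id -> fingerprint recorded when the item was last ingested.
    """
    changed = []
    for item_id, text in source.items():
        if indexed_fingerprints.get(item_id) != fingerprint(text):
            changed.append(item_id)
    return changed


source = {"a": "hello", "b": "world", "c": "brand new"}
last_sync = {"a": fingerprint("hello"), "b": fingerprint("old text")}
print(plan_upserts(source, last_sync))  # prints: ['b', 'c']
```

Only the returned ids then need to be re-encoded and passed to ingest() inside an upsert session.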

Performance Considerations

Both insert and upsert operations add vectors to the delta segment of the index. This means:
  1. New vectors are searchable immediately after finalize(); there is no need to rebuild the entire index
  2. The delta segment grows with each insert/upsert cycle, which may gradually increase query latency
  3. A periodic rebuild with rebuild_compact() can merge the delta segment back into the main segment for optimal performance

Delta Segment and Rebuilds

The delta_ratio parameter controls how large the delta segment can grow relative to the main segment. When the delta segment exceeds this ratio, search performance may degrade. You can check if a rebuild is needed and perform one:

if engine.needs_rebuild():
    engine.rebuild_compact(M=48, ef_construction=1024, ef_search=512)

For production systems with frequent updates, it’s recommended to schedule periodic rebuilds during low-traffic periods to maintain optimal search performance.
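
A scheduled job can combine the needs_rebuild() check with a low-traffic window. The sketch below shows one way to do this. The helper names in_window and maybe_rebuild and the 02:00 to 04:00 window are assumptions for illustration. Only needs_rebuild() and rebuild_compact(...) come from the example above.

```python
from datetime import time


def in_window(now, start, end):
    """Return True if `now` falls in [start, end), allowing windows that wrap midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def maybe_rebuild(engine, now, start=time(2, 0), end=time(4, 0), **rebuild_params):
    """Rebuild only inside the maintenance window and only if the engine asks for it.

    Illustrative helper, not part of the library. `rebuild_params` is passed
    through to engine.rebuild_compact(), for example M, ef_construction
    and ef_search.
    """
    if in_window(now, start, end) and engine.needs_rebuild():
        engine.rebuild_compact(**rebuild_params)
        return True
    return False
```

A cron job or scheduler could call it periodically, for example maybe_rebuild(engine, datetime.now().time(), M=48, ef_construction=1024, ef_search=512), and it does nothing outside the window.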