Search Operations

The search endpoint finds the k nearest neighbors to a query vector using the HNSW index. For performance, the request body is binary (raw float32 values); the response is JSON. Send a POST request with the query vector as the body:
POST /search.bin?index_name=my_index&k=10&efs=64
Content-Type: application/octet-stream

Query Parameters

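| Parameter  | Type    | Required | Default | Description                                 |
|------------|---------|----------|---------|---------------------------------------------|
| index_name | string  | Yes      | -       | Name of the index to search                 |
| k          | integer | No       | 10      | Number of nearest neighbors to return       |
| efs        | integer | No       | 64      | Search-time ef (controls recall vs. speed)  |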
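
As a small illustration of how these parameters combine into a request URL, the sketch below builds the query string with the standard library. The helper name `search_url` and the base URL are assumptions made for this example; only the parameter names and defaults come from the table above.

```python
from urllib.parse import urlencode


def search_url(base_url, index_name, k=10, efs=64):
    """Build a /search.bin URL (hypothetical helper, not part of any client library)."""
    # urlencode escapes special characters in the index name and keeps dict order.
    query = urlencode({"index_name": index_name, "k": k, "efs": efs})
    return f"{base_url}/search.bin?{query}"


url = search_url("http://localhost:1984", "my_index", k=5)
print(url)  # http://localhost:1984/search.bin?index_name=my_index&k=5&efs=64
```

Passing the defaults explicitly, as done here, keeps the request self-describing even if the server-side defaults change.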

Binary Request Body

The request body contains the query vector as float32 little-endian values:
[dim=4 example]
[0.1f32] [0.2f32] [0.3f32] [0.4f32]
For a 384-dimensional index, the request body is 384 * 4 = 1536 bytes.
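
The byte layout described above can be reproduced with Python's standard library alone. This is a minimal sketch: `encode_query` is a hypothetical helper name introduced for illustration, not part of any client library.

```python
import struct


def encode_query(values):
    """Pack a sequence of floats as little-endian float32 bytes (hypothetical helper)."""
    # "<" forces little-endian byte order; "f" is a 4-byte IEEE 754 float.
    return struct.pack(f"<{len(values)}f", *values)


body = encode_query([0.1, 0.2, 0.3, 0.4])
print(len(body))  # 16 bytes: 4 dimensions * 4 bytes each
```

Checking `len(body) == dim * 4` before sending is a cheap way to catch a vector whose dimension does not match the index.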

Response

The response is a JSON array of external IDs, ordered by similarity (nearest first):
["item_042", "item_137", "item_891", "item_003", "item_256"]
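
Because the array order carries the ranking, a client may want to attach an explicit rank to each ID. The sketch below does that for the sample response above; `parse_results` is a hypothetical helper, and the check that every ID is a string is an assumption based on the sample, not a documented guarantee.

```python
import json


def parse_results(payload):
    """Decode a search response into (rank, external_id) pairs (hypothetical helper)."""
    ids = json.loads(payload)
    # Assumption: the body is a flat JSON array of string IDs, as in the sample.
    if not isinstance(ids, list) or not all(isinstance(i, str) for i in ids):
        raise ValueError("expected a JSON array of string IDs")
    # Rank 1 is the nearest neighbor, since the array is ordered nearest first.
    return list(enumerate(ids, start=1))


sample = '["item_042", "item_137", "item_891", "item_003", "item_256"]'
ranked = parse_results(sample)
for rank, external_id in ranked:
    print(rank, external_id)
```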

Example with curl

# Create a binary query vector file (4 dimensions)
python3 -c "
import struct, sys
vals = [0.1, 0.2, 0.3, 0.4]
sys.stdout.buffer.write(struct.pack(f'<{len(vals)}f', *vals))
" > query.bin

# Search
curl -X POST "http://localhost:1984/search.bin?index_name=my_index&k=5" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @query.bin

Example with Python

import numpy as np
import requests

# "<f4" is little-endian float32, matching the wire format on any platform
query = np.random.randn(384).astype("<f4")

response = requests.post(
    "http://localhost:1984/search.bin",
    params={"index_name": "my_index", "k": 10, "efs": 64},
    data=query.tobytes(),
    headers={"Content-Type": "application/octet-stream"},
)

response.raise_for_status()  # fail early on HTTP errors
results = response.json()
print(results)  # e.g. ["item_042", "item_137", ...]

Search Parameters Explained

  • k — The number of results to return. Higher values return more neighbors but take slightly longer. Typical values are 5-100.
  • efs — The search-time ef parameter controls the trade-off between recall and speed. Higher values give better recall (more accurate results) but slower queries. The efs parameter overrides the ef_search value set during index creation for this specific query.

Choosing efs

| efs Value | Use Case                                                                                   |
|-----------|--------------------------------------------------------------------------------------------|
| 32-64     | Fast, approximate results; good for autocomplete or exploratory search                     |
| 64-128    | Balanced speed and recall; good for most production use cases                              |
| 128-512   | High recall; good for applications where accuracy is critical                              |
| 512+      | Maximum recall; use when you need the best possible results and can tolerate slower queries |
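
One way to pick a value from these ranges empirically is to compare the IDs returned at a low efs against those returned at a much higher efs for the same query, treating the high-efs results as an approximate reference. The sketch below shows only the comparison step; `recall_at_k` is a hypothetical helper and the ID lists are made-up placeholders, not real search output.

```python
def recall_at_k(candidate_ids, reference_ids):
    """Fraction of reference IDs that also appear in the candidate list."""
    if not reference_ids:
        return 0.0
    reference = set(reference_ids)
    return len(reference.intersection(candidate_ids)) / len(reference)


# Placeholder data: in practice these would come from two /search.bin calls
# with the same query vector and k, differing only in efs.
reference = ["a", "b", "c", "d"]  # e.g. results at a high efs
fast = ["a", "c", "x", "b"]       # e.g. results at a low efs
print(recall_at_k(fast, reference))  # 0.75
```

Averaging this score over a representative set of queries, while also recording latency, gives a concrete basis for choosing the lowest efs that meets a recall target.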