Deployment
brinicle can be deployed in several ways depending on your architecture and requirements. This guide covers the most common deployment patterns.
Docker Deployment
The simplest way to deploy brinicle as a service is using Docker. The repository includes a Dockerfile and docker-compose.yml for production-ready deployment.
Using Docker Compose
/app/data/ directory inside the container.
Custom Configuration
Modify the docker-compose.yml for your needs:
Persistent Data
To persist index data across container restarts, mount a volume:
Manual Deployment
If you prefer to deploy without Docker, you can install and run the server directly.
Install Dependencies
Run the Server
Systemd Service
Create a systemd service file for production deployment:
Nginx Reverse Proxy
For production, it’s recommended to place brinicle behind a reverse proxy that handles TLS termination, authentication, and rate limiting:
Embedded Deployment
For applications that don’t need a separate server, you can embed brinicle directly in your Python process:
Scaling Considerations
Single Instance
brinicle is designed for single-instance deployments with datasets under 10M vectors. For most use cases, a single instance is sufficient and provides the simplest operational model.
Memory Planning
When planning memory for your deployment, consider:
- Index size: brinicle is disk-first, but it still needs some RAM for the delta segment and search buffers
- Concurrent queries: more concurrent queries require more RAM for search buffers
- Delta ratio: a higher delta_ratio means more RAM for the delta segment
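The factors above can be turned into a rough back-of-the-envelope estimate. The sketch below is not brinicle's actual memory model: it assumes that delta_ratio is the fraction of vectors held in the in-memory delta segment, that vectors are stored as float32, and that each in-flight query needs a fixed search buffer. The function name `estimate_ram` and the default buffer size are hypothetical, so measure a real deployment before relying on the numbers.

```python
from dataclasses import dataclass


@dataclass
class MemoryPlan:
    delta_bytes: int   # RAM attributed to the delta segment
    buffer_bytes: int  # RAM attributed to per-query search buffers

    @property
    def total_bytes(self) -> int:
        return self.delta_bytes + self.buffer_bytes


def estimate_ram(
    num_vectors: int,
    dim: int,
    delta_ratio: float,
    concurrent_queries: int,
    bytes_per_value: int = 4,               # assumption: float32 components
    buffer_bytes_per_query: int = 1 << 20,  # assumption: 1 MiB per in-flight query
) -> MemoryPlan:
    """Rough RAM estimate; every constant here is an illustrative assumption."""
    # Assumption: delta_ratio is the share of vectors living in the delta segment.
    delta_vectors = int(num_vectors * delta_ratio)
    delta_bytes = delta_vectors * dim * bytes_per_value
    buffer_bytes = concurrent_queries * buffer_bytes_per_query
    return MemoryPlan(delta_bytes=delta_bytes, buffer_bytes=buffer_bytes)


if __name__ == "__main__":
    plan = estimate_ram(num_vectors=1_000_000, dim=128,
                        delta_ratio=0.25, concurrent_queries=8)
    print(f"estimated RAM: {plan.total_bytes / 2**20:.1f} MiB")
```

Both terms grow linearly, so doubling delta_ratio or the number of concurrent queries doubles the corresponding part of the estimate.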
Index Sharding
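As a rough illustration of the routing layer this section describes, the sketch below shows two common pieces: a stable hash that maps a partition key (for example an item or tenant ID) to a shard, and a fan-out search that merges the closest hits from every shard. It does not use brinicle's API. Each shard is represented by a hypothetical callable that takes a query vector and `k` and returns `(distance, item_id)` pairs, which you would implement against your own brinicle instances.

```python
import hashlib
import heapq
from typing import Callable, List, Sequence, Tuple

# Hypothetical shard interface: (query_vector, k) -> list of (distance, item_id).
Hit = Tuple[float, str]
SearchFn = Callable[[Sequence[float], int], List[Hit]]


def shard_for(key: str, num_shards: int) -> int:
    """Map a partition key to a shard index with a stable hash.

    Python's built-in hash() is salted per process, so a digest is used
    to keep the mapping identical across restarts and machines.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def search_all(shards: Sequence[SearchFn], query: Sequence[float], k: int) -> List[Hit]:
    """Query every shard and keep the k hits with the smallest distance."""
    hits: List[Hit] = []
    for search in shards:
        hits.extend(search(query, k))
    return heapq.nsmallest(k, hits)
```

If queries always carry the partition key, `shard_for` alone is enough to send each query to a single shard. `search_all` covers the case where data is spread by ID and any shard may hold the nearest neighbors.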
For datasets larger than 10M vectors, consider sharding your data across multiple brinicle instances, each handling a subset of the data. You can implement a simple routing layer that directs queries to the appropriate shard based on the data partitioning strategy.
Health Monitoring
The HTTP server provides a health check endpoint that returns the number of loaded indexes:
- A successful response indicates the server is running
- The message includes the number of loaded indexes for operational awareness
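A monitoring script can poll the endpoint with nothing more than the standard library. The sketch below is a minimal example under stated assumptions: the `/health` path and the host and port in the usage line are placeholders, not confirmed brinicle routes, and the response body is returned as raw text because its exact format is not specified here.

```python
import urllib.request
from typing import Callable, Tuple


def check_health(
    base_url: str,
    path: str = "/health",  # assumption: replace with the server's real route
    timeout: float = 2.0,
    opener: Callable = urllib.request.urlopen,
) -> Tuple[bool, str]:
    """Return (is_healthy, message) for a single health probe."""
    url = base_url.rstrip("/") + path
    try:
        with opener(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            # Any 2xx status is treated as "server is running".
            return 200 <= resp.status < 300, body
    except OSError as exc:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False, str(exc)


if __name__ == "__main__":
    # Hypothetical address; use the host and port your server listens on.
    ok, message = check_health("http://localhost:8000")
    print("healthy" if ok else "unhealthy", message)
```

The `opener` parameter exists only so the probe can be exercised without a running server, for example by passing a stub in a unit test.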