Document Store: Performance Optimization
Part of: HeliosDB Document Store User Guide
Query Optimization
1. Use Indexes:

    // Slow (full collection scan)
    collection.find(doc! { "email": "alice@example.com" }, None).await?; // 45ms

    // Fast (indexed lookup)
    // After: create_index({ "email": 1 })
    collection.find(doc! { "email": "alice@example.com" }, None).await?; // 2.8ms

2. Use Projections (reduce data transfer):

    // Fetch only needed fields
    let options = FindOptions::builder()
        .projection(doc! { "name": 1, "email": 1, "_id": 0 })
        .build();

    collection.find(doc! {}, options).await?; // 40% faster

3. Use Covered Queries (query entirely from index):

    // Create a compound index first: create_index({ "status": 1, "email": 1 })

    // The query filters and projects only fields covered by that index
    let options = FindOptions::builder()
        .projection(doc! { "status": 1, "email": 1, "_id": 0 })
        .build();

    collection.find(doc! { "status": "active" }, options).await?; // 90% faster

4. Limit Result Sets:

    // Don't fetch everything
    let options = FindOptions::builder()
        .limit(20) // Pagination
        .skip(0)
        .build();

    collection.find(doc! {}, options).await?;

Index Selection Guidelines
When to Index:
- Fields used in WHERE clauses frequently
- Fields used in ORDER BY
- Fields used in GROUP BY
- Foreign keys (for joins)
- Unique fields (email, username)
When NOT to Index:
- ❌ Low-cardinality fields (boolean, status with 2-3 values)
- ❌ Rarely queried fields
- ❌ Fields with high write:read ratio
- ❌ Very large text fields (use text index instead)
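
One way to check whether a field is low-cardinality before indexing it is to sample some of its values and count how many are distinct. The sketch below uses only the Rust standard library. `distinct_ratio` and the sample data are hypothetical, written for illustration, and are not part of the HeliosDB API. Any cutoff applied to the ratio is a judgment call.

```rust
use std::collections::HashSet;

/// Fraction of distinct values in a sample (0.0 for an empty sample).
/// Hypothetical helper, not a HeliosDB function.
fn distinct_ratio(values: &[&str]) -> f64 {
    if values.is_empty() {
        return 0.0;
    }
    let distinct: HashSet<&str> = values.iter().copied().collect();
    distinct.len() as f64 / values.len() as f64
}

fn main() {
    // A status field with only two distinct values: a weak index candidate.
    let statuses = ["active", "inactive", "active", "active", "inactive", "active"];
    // An email field where every value is unique: a strong index candidate.
    let emails = ["alice@example.com", "bob@example.com", "carol@example.com"];

    println!("status distinct ratio: {:.2}", distinct_ratio(&statuses));
    println!("email distinct ratio:  {:.2}", distinct_ratio(&emails));
}
```

A ratio close to 1.0 suggests a selective field such as email or username. A ratio close to 0 suggests the index would rarely narrow the search.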
Index Overhead:
- Write performance: -10% to -30% (more indexes = slower writes)
- Storage: +5% to +50% (depends on index types)
- Memory: Indexes cached in RAM for performance
Compression
Enable Compression for Documents > 1KB:

    use heliosdb_document::{InsertOptions, CompressionCodec};

    let options = InsertOptions {
        compression: Some(CompressionCodec::Zstd), // 60% compression
        ..Default::default()
    };

    store.insert_with_options(&collection, &id, data, &options)?;

Compression Codecs:
| Codec | Compression Ratio | Speed | Use Case |
|---|---|---|---|
| Zstd | 60-70% | Fast | General purpose (recommended) |
| LZ4 | 50-60% | Fastest | Low-latency queries |
| Snappy | 40-50% | Fast | Balanced |
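
To compare codecs for a given data set, the ratios in the table can be turned into a rough size estimate. The sketch below is standalone arithmetic and makes two assumptions. The local `Codec` enum is a stand-in for illustration, not `heliosdb_document::CompressionCodec`. The reduction for each codec is taken as the midpoint of its range in the table. Real savings depend on the data.

```rust
/// Stand-in for the codecs in the table above (not the HeliosDB type).
#[derive(Debug, Clone, Copy)]
enum Codec {
    Zstd,
    Lz4,
    Snappy,
}

impl Codec {
    /// Assumed size reduction in percent: the midpoint of the table's range.
    fn reduction_percent(self) -> u64 {
        match self {
            Codec::Zstd => 65,
            Codec::Lz4 => 55,
            Codec::Snappy => 45,
        }
    }
}

/// Estimated size after compression, in the same unit as `uncompressed`.
fn estimated_size(uncompressed: u64, codec: Codec) -> u64 {
    uncompressed * (100 - codec.reduction_percent()) / 100
}

fn main() {
    let uncompressed_mb = 500; // e.g. 100K documents at 5KB average
    for codec in [Codec::Zstd, Codec::Lz4, Codec::Snappy] {
        println!(
            "{:?}: {} MB -> ~{} MB",
            codec,
            uncompressed_mb,
            estimated_size(uncompressed_mb, codec)
        );
    }
}
```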
Storage Savings Example:

    100K documents × 5KB avg = 500MB uncompressed
    With Zstd: 500MB → 175MB (65% reduction)

Batch Operations
Batch Inserts (120K docs/sec):

    // Slow: Insert one by one
    for doc in documents {
        collection.insert_one(doc, None).await?; // 15K docs/sec
    }

    // Fast: Batch insert
    collection.insert_many(documents, None).await?; // 120K docs/sec (8x faster)

Batch Size Recommendations:
- Small docs (<1KB): Batches of 500-1000
- Medium docs (1-10KB): Batches of 100-500
- Large docs (>10KB): Batches of 10-100
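
The recommendations above can be applied by choosing a batch size from the average document size and splitting the input with `chunks`. This is a minimal sketch using only the standard library. `batch_size_for` is a hypothetical helper, and the values it returns are assumptions picked from inside each recommended range, not tuned numbers.

```rust
/// Pick a batch size from the average document size in bytes.
/// The returned values are assumptions chosen from within each recommended range.
fn batch_size_for(avg_doc_bytes: usize) -> usize {
    const KB: usize = 1024;
    if avg_doc_bytes < KB {
        1000 // small docs: 500-1000 per batch
    } else if avg_doc_bytes <= 10 * KB {
        250 // medium docs: 100-500 per batch
    } else {
        50 // large docs: 10-100 per batch
    }
}

fn main() {
    // Stand-in documents; real code would hold the values passed to insert_many.
    let documents: Vec<String> = (0..1200).map(|i| format!("doc-{}", i)).collect();
    let batch_size = batch_size_for(4 * 1024); // about 4KB average

    for (n, batch) in documents.chunks(batch_size).enumerate() {
        // Each `batch` would be sent in a single insert_many call.
        println!("batch {}: {} documents", n, batch.len());
    }
}
```

Because `chunks` yields a shorter final slice when the input does not divide evenly, no documents are dropped.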
Connection Pooling
Use Connection Pools:

    use mongodb::{options::ClientOptions, Client};

    let mut options = ClientOptions::parse("mongodb://localhost:27017").await?;
    options.max_pool_size = Some(100); // Default: 10
    options.min_pool_size = Some(10); // Keep connections warm

    let client = Client::with_options(options)?;

Pool Size Guidelines:
| Workload | Min Pool | Max Pool |
|---|---|---|
| Low traffic | 5 | 20 |
| Medium traffic | 10 | 50 |
| High traffic | 20 | 100 |
| Very high traffic | 50 | 200 |
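
If the pool size is chosen from configuration rather than hard-coded, the table can be encoded as a small lookup. The sketch below is one possible shape. The `Workload` enum and `pool_bounds` function are hypothetical names introduced here, and how a deployment is classified as low or high traffic is left to the operator.

```rust
/// Hypothetical workload classes matching the rows of the table above.
#[derive(Debug, Clone, Copy)]
enum Workload {
    Low,
    Medium,
    High,
    VeryHigh,
}

/// Returns (min_pool_size, max_pool_size) as listed in the table.
fn pool_bounds(workload: Workload) -> (u32, u32) {
    match workload {
        Workload::Low => (5, 20),
        Workload::Medium => (10, 50),
        Workload::High => (20, 100),
        Workload::VeryHigh => (50, 200),
    }
}

fn main() {
    let (min, max) = pool_bounds(Workload::High);
    // With ClientOptions, these would be assigned as
    // options.min_pool_size = Some(min) and options.max_pool_size = Some(max).
    println!("min_pool_size = {}, max_pool_size = {}", min, max);
}
```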
Query Performance Checklist
Always do:
- Use indexes for frequently queried fields
- Use projections to reduce data transfer
- Use pagination (limit + skip)
- Monitor slow queries (>10ms)
- Use batch operations for bulk inserts
❌ Avoid:
- Full collection scans on large collections
- Regex without anchors (^, $)
- Fetching entire large documents when only a few fields are needed
- Creating too many indexes (>15 per collection)
- Using $where (JavaScript expressions)
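
For the slow-query item in the checklist, a simple starting point is to time each operation and flag anything over the 10ms threshold. The sketch below is synchronous and uses only the standard library. `timed` is a hypothetical helper, and the sleep stands in for a real query. An async code base would wrap the awaited call in the same way.

```rust
use std::time::{Duration, Instant};

/// Threshold from the checklist: anything slower than 10ms counts as slow.
const SLOW_QUERY_THRESHOLD: Duration = Duration::from_millis(10);

/// Run `op` and return its result, the elapsed time, and whether
/// the elapsed time exceeded the slow-query threshold.
fn timed<T>(op: impl FnOnce() -> T) -> (T, Duration, bool) {
    let start = Instant::now();
    let result = op();
    let elapsed = start.elapsed();
    (result, elapsed, elapsed > SLOW_QUERY_THRESHOLD)
}

fn main() {
    // Stand-in for a query: sleep to simulate a full collection scan.
    let (rows, elapsed, slow) = timed(|| {
        std::thread::sleep(Duration::from_millis(25));
        1000
    });
    if slow {
        eprintln!("slow query: {} rows in {:?}", rows, elapsed);
    }
}
```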
Performance Benchmarks
Query Latency (P99):
| Operation | Latency | Notes |
|---|---|---|
| Get by ID (indexed) | 2.8ms | Single document |
| Find with filter (indexed) | 4.2ms | 100 docs |
| Find without index | 45ms | 1K docs, full scan |
| Aggregation (group) | 12ms | 1K docs |
| Text search | 8.5ms | 1K docs, text index |
| Geospatial query | 5ms | 1K docs, 2dsphere index |
Write Throughput:
| Operation | Throughput | Notes |
|---|---|---|
| Insert (batch) | 120K docs/sec | Batches of 100 |
| Insert (single) | 15K docs/sec | Individual inserts |
| Update | 25K ops/sec | Single field |
| Delete | 30K ops/sec | Soft delete |
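
The throughput figures give a rough way to estimate bulk-load time. The sketch below is plain arithmetic on the numbers in the table. It assumes the quoted rates hold for the whole load, which may not be true on different hardware or with different document sizes. `load_seconds` is a hypothetical helper.

```rust
/// Throughput figures from the table above, in documents per second.
const BATCH_DOCS_PER_SEC: u64 = 120_000;
const SINGLE_DOCS_PER_SEC: u64 = 15_000;

/// Seconds needed to insert `docs` documents at `docs_per_sec`, rounded up.
fn load_seconds(docs: u64, docs_per_sec: u64) -> u64 {
    (docs + docs_per_sec - 1) / docs_per_sec
}

fn main() {
    let docs = 1_000_000;
    println!("batch inserts:  ~{} s", load_seconds(docs, BATCH_DOCS_PER_SEC));
    println!("single inserts: ~{} s", load_seconds(docs, SINGLE_DOCS_PER_SEC));
}
```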