HeliosDB-Lite Index & SMFI Optimization Roadmap
HeliosDB-Lite Index & SMFI Optimization Roadmap
Status: VALIDATED and APPROVED Date: 2026-01-28 Target: Sub-millisecond point lookups, 2.5M+ rows/sec ingestion with full ACID
Executive Summary
This roadmap addresses the fundamental performance gap in HeliosDB-Lite’s query execution:
- Current: 150ms point lookups (full table scan every query)
- Target: <1ms point lookups (index-based direct access)
Validated Requirements
1. Implement ART-based Indexes (Adaptive Radix Tree)
VALIDATED: Research confirms ART is superior to B-tree for HeliosDB’s use case.
| Metric | B-tree | ART (Recommended) |
|---|---|---|
| Point Lookup | O(log N) | O(k) key length |
| Space Efficiency | 20-40 bytes/key | 8.1 bytes/key |
| Write Performance | Rebalancing overhead | No rebalancing |
| Cache Behavior | Moderate | Excellent |
| Range Scan | Good | Excellent |
Why ART over B-tree:
- Lock-free friendly (matches existing lock-free ingestion)
- Adaptive node sizing (Node4/Node16/Node48/Node256)
- Sorted order preserved for range queries
- Better cache locality for modern CPUs
- DuckDB, HyPer use ART successfully
Implementation:
pub struct ARTIndex<V> { root: AtomicPtr<ARTNode>, node_count: AtomicU64, memory_usage: AtomicU64,}
// Node types for adaptive sizingenum ARTNode { Node4 { keys: [u8; 4], children: [*mut ARTNode; 4], num_children: u8 }, Node16 { keys: [u8; 16], children: [*mut ARTNode; 16], num_children: u8 }, Node48 { keys: [u8; 256], children: [*mut ARTNode; 48], num_children: u8 }, Node256 { children: [*mut ARTNode; 256] }, Leaf { key: Vec<u8>, value: V },}2. Async SMFI Generation and Early Usage
VALIDATED: SMFI structures exist but are applied too late.
Current Flow (WRONG):
Query → scan_table_branch_aware() → LOAD ALL ROWS → scan_with_pushdown() → filterRequired Flow (CORRECT):
Query → Check Index → If no index: Check SMFI → Load only candidate rows → filterImplementation Plan:
pub struct SMFIManager { // Bloom filters per table/column bloom_filters: DashMap<(String, String), BloomFilter>,
// Zone maps per table/column (min/max per block) zone_maps: DashMap<(String, String), ZoneMap>,
// Background builder builder_tx: mpsc::Sender<BuildTask>,
// Read-during-rebuild support partial_filters: DashMap<String, PartialBloomFilter>,}
impl SMFIManager { /// Check if value MIGHT exist (for point lookups on non-indexed columns) pub fn might_contain(&self, table: &str, column: &str, value: &Value) -> MightContain { // 1. Check if bloom filter exists if let Some(bf) = self.bloom_filters.get(&(table.into(), column.into())) { if !bf.might_contain(value) { return MightContain::DefinitelyNot; } }
// 2. Check zone map for range info if let Some(zm) = self.zone_maps.get(&(table.into(), column.into())) { if !zm.value_in_range(value) { return MightContain::DefinitelyNot; } // Return candidate blocks return MightContain::Maybe(zm.get_candidate_blocks(value)); }
MightContain::Unknown }
/// Build SMFI structures asynchronously pub async fn build_async(&self, table: &str, schema: &Schema) { self.builder_tx.send(BuildTask::Full { table: table.into(), schema: schema.clone() }).await; }
/// Incremental update on INSERT pub fn on_insert(&self, table: &str, row_id: u64, tuple: &Tuple, schema: &Schema) { // Update bloom filters incrementally (O(1)) for (idx, col) in schema.columns.iter().enumerate() { if let Some(bf) = self.bloom_filters.get_mut(&(table.into(), col.name.clone())) { bf.insert(&tuple.values[idx]); } }
// Update zone maps incrementally // ... }}3. Auto-Create Indexes on PRIMARY KEY and UNIQUE
VALIDATED: This is essential for O(1) lookups.
Behavior:
PRIMARY KEY→ Automatic ART index (enforces uniqueness)UNIQUE→ Automatic ART index (allows single NULL)- Index maintained on INSERT/UPDATE/DELETE
Implementation:
// In src/storage/catalog.rs - create_table()pub fn create_table(&self, table_name: &str, schema: Schema) -> Result<()> { // ... existing code ...
// AUTO-CREATE INDEXES for (idx, col) in schema.columns.iter().enumerate() { if col.primary_key { // Create PRIMARY KEY index let index_name = format!("{}_pk_idx", table_name); self.storage.create_art_index( &index_name, table_name, &[col.name.clone()], IndexType::PrimaryKey, )?; }
if col.unique && !col.primary_key { // Create UNIQUE index (allows one NULL) let index_name = format!("{}_{}_unique_idx", table_name, col.name); self.storage.create_art_index( &index_name, table_name, &[col.name.clone()], IndexType::Unique { allow_null: true }, )?; } }
Ok(())}UNIQUE with NULL handling:
pub enum IndexType { PrimaryKey, // No NULLs allowed Unique { allow_null: bool }, // One NULL allowed if true Secondary, // Multiple duplicates allowed}
impl ARTIndex { pub fn insert_unique(&self, key: &[u8], value: u64, allow_null: bool) -> Result<()> { // Handle NULL case if key == NULL_MARKER { if allow_null { // Check if NULL already exists if self.null_row_id.is_some() { return Err(Error::UniqueViolation("NULL already exists")); } self.null_row_id = Some(value); return Ok(()); } return Err(Error::NullNotAllowed); }
// Normal unique insert // ... }}4. Push Predicates INTO Storage Iterator
VALIDATED: This is the critical architectural change.
Current (scan.rs:452):
// PROBLEM: All rows loaded BEFORE filteringlet base_tuples = storage.scan_table_branch_aware(&actual_table_name)?;storage.predicate_pushdown().scan_with_pushdown(..., base_tuples, ...);Required Change:
// SOLUTION: Pass predicates TO storage, filter DURING iterationlet tuples = storage.scan_table_with_predicates( &actual_table_name, &analyzed_predicates, // Pass predicates to storage &schema,)?;New Storage Method:
pub fn scan_table_with_predicates( &self, table_name: &str, predicates: &[AnalyzedPredicate], schema: &Schema,) -> Result<Vec<Tuple>> { // 1. Check for index candidates for pred in predicates { if pred.op == PredicateOp::Eq { // Check if column has an index if let Some(index) = self.get_index_for_column(table_name, &pred.column_name) { // DIRECT INDEX LOOKUP - O(1) if let Some(row_id) = index.get(&pred.value)? { return self.get_tuple_by_row_id(table_name, row_id); } return Ok(vec![]); // Not found } } }
// 2. No index hit - check SMFI let smfi = self.smfi_manager(); let mut candidate_row_ids: Option<Vec<u64>> = None;
for pred in predicates { if pred.op == PredicateOp::Eq { match smfi.might_contain(table_name, &pred.column_name, &pred.value) { MightContain::DefinitelyNot => return Ok(vec![]), // Early exit! MightContain::Maybe(blocks) => { // Only scan these blocks candidate_row_ids = Some(self.get_row_ids_in_blocks(table_name, &blocks)?); } MightContain::Unknown => {} // Fall through to full scan } } }
// 3. Scan only candidate rows (or full scan if no SMFI info) if let Some(row_ids) = candidate_row_ids { // Scan only specific rows self.scan_rows_by_ids(table_name, &row_ids, predicates, schema) } else { // Full scan with predicate filtering during iteration self.scan_table_with_filter(table_name, predicates, schema) }}
fn scan_table_with_filter( &self, table_name: &str, predicates: &[AnalyzedPredicate], schema: &Schema,) -> Result<Vec<Tuple>> { let mut results = Vec::new(); let prefix = format!("data:{}:", table_name);
// Iterate with early filtering for item in self.db.prefix_iterator(prefix.as_bytes()) { let (key, value) = item?;
// Deserialize let tuple: Tuple = bincode::deserialize(&value)?;
// FILTER DURING ITERATION (not after) if self.tuple_matches_predicates(&tuple, predicates, schema) { results.push(tuple);
// Early termination for LIMIT (if supported) // ... } }
Ok(results)}5. SMFI as Fallback for Non-Indexed Columns
VALIDATED: This is the desired behavior.
Query Execution Priority:
1. Check for explicit index on predicate column(s) └─ YES → Use index for O(1) or O(log k) lookup └─ NO → Continue to step 2
2. Check SMFI bloom filter └─ DefinitelyNot → Return empty result immediately └─ Maybe → Get candidate blocks, scan only those └─ Unknown → Continue to step 3
3. Check SMFI zone map └─ Value outside all ranges → Return empty result └─ Value in some blocks → Scan only those blocks └─ No zone map → Continue to step 4
4. Full scan with predicate filtering during iterationExample Query Flow:
SELECT * FROM orders WHERE customer_id = 12345;1. Check index on `customer_id` - No explicit index created by user - Not PRIMARY KEY or UNIQUE → No index available
2. Check bloom filter for `orders.customer_id` - Bloom filter says: might_contain(12345) = true → Cannot skip
3. Check zone map for `orders.customer_id` - Block 0: min=1, max=5000 → SKIP - Block 1: min=5001, max=15000 → CANDIDATE (contains 12345) - Block 2: min=15001, max=25000 → SKIP → Scan only Block 1
4. Scan Block 1 rows with filter - Only ~1000 rows instead of 3000 - Apply predicate during iteration → Return matching rowsImplementation Roadmap
Phase 1: ART Index Infrastructure (Week 1-2)
Files to Create:
src/storage/art_index.rs- Core ART implementationsrc/storage/art_node.rs- Node types (Node4/16/48/256)src/storage/index_manager.rs- Index lifecycle management
Files to Modify:
src/storage/mod.rs- Export new modulessrc/storage/catalog.rs- Auto-create indexes on PK/UNIQUEsrc/storage/engine.rs- Index management methods
Key APIs:
// Index creationpub fn create_art_index(&self, name: &str, table: &str, columns: &[String], index_type: IndexType) -> Result<()>;
// Index lookuppub fn index_lookup(&self, table: &str, column: &str, value: &Value) -> Result<Option<u64>>;
// Index maintenancepub fn index_insert(&self, table: &str, column: &str, value: &Value, row_id: u64) -> Result<()>;pub fn index_delete(&self, table: &str, column: &str, value: &Value) -> Result<()>;Phase 2: Async SMFI Manager (Week 2-3)
Files to Create:
src/storage/smfi_manager.rs- Unified SMFI managementsrc/storage/smfi_builder.rs- Background builder worker
Files to Modify:
src/storage/predicate_pushdown.rs- Integrate with SMFI managersrc/storage/engine.rs- SMFI hooks on INSERT/UPDATE/DELETE
Key APIs:
// SMFI queriespub fn might_contain(&self, table: &str, column: &str, value: &Value) -> MightContain;pub fn get_candidate_blocks(&self, table: &str, column: &str, value: &Value) -> Vec<BlockId>;
// Async buildingpub async fn build_smfi_async(&self, table: &str);pub fn trigger_rebuild(&self, table: &str);
// Incremental updatespub fn on_insert(&self, table: &str, row_id: u64, tuple: &Tuple, schema: &Schema);pub fn on_delete(&self, table: &str, row_id: u64, tuple: &Tuple, schema: &Schema);Phase 3: Predicate Pushdown Rewrite (Week 3-4)
Files to Modify:
src/sql/executor/scan.rs- Pass predicates to storagesrc/storage/engine.rs- Newscan_table_with_predicates()methodsrc/optimizer/rules.rs- Optimize predicate ordering
Key Changes:
// BEFORE (scan.rs:452)let base_tuples = storage.scan_table_branch_aware(&actual_table_name)?;storage.predicate_pushdown().scan_with_pushdown(...);
// AFTERlet tuples = storage.scan_table_with_predicates( &actual_table_name, &analyzed_predicates, &schema,)?;Phase 4: Integration & Testing (Week 4-5)
Test Cases:
- Point lookup on PRIMARY KEY → <1ms
- Point lookup on UNIQUE → <1ms
- Point lookup on non-indexed with SMFI → <10ms (block skip)
- Range query with zone map → Block skipping verified
- INSERT maintains all indexes → No performance regression
- Concurrent access → Lock-free correctness
Benchmarks:
# Point lookup benchmarkcargo bench --bench index_lookup -- point_lookup
# Expected results:# - 100K rows: <1ms (was 150ms)# - 1M rows: <1ms (was 1500ms)# - 10M rows: <2ms (would be 15000ms)Architecture Diagram
┌─────────────────────────────────────────────────────────────────┐│ Query Executor │├─────────────────────────────────────────────────────────────────┤│ ││ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ ││ │ Parser │ → │ Planner │ → │ Predicate Analyzer │ ││ └─────────────┘ └─────────────┘ └─────────────────────┘ ││ │ ││ ▼ ││ ┌─────────────────────────────────────────┐ ││ │ Index/SMFI Lookup │ ││ │ │ ││ │ 1. Check ART Index (PK/UNIQUE) │ ││ │ └─ O(k) direct lookup │ ││ │ │ ││ │ 2. Check SMFI Bloom Filter │ ││ │ └─ Definitely not? → Return empty │ ││ │ │ ││ │ 3. Check SMFI Zone Map │ ││ │ └─ Get candidate blocks │ ││ │ │ ││ │ 4. Filtered Scan (last resort) │ ││ │ └─ Filter during iteration │ ││ └─────────────────────────────────────────┘ ││ │ │└──────────────────────────────────────────────│──────────────────┘ │┌──────────────────────────────────────────────▼──────────────────┐│ Storage Engine │├─────────────────────────────────────────────────────────────────┤│ ││ ┌─────────────────┐ ┌─────────────────┐ ┌────────────────┐ ││ │ ART Indexes │ │ SMFI Manager │ │ RocksDB │ ││ │ │ │ │ │ (LSM-tree) │ ││ │ - PK Index │ │ - Bloom Filters │ │ │ ││ │ - UNIQUE Index │ │ - Zone Maps │ │ - Persistence │ ││ │ - Secondary │ │ - Async Build │ │ - Compression │ ││ └─────────────────┘ └─────────────────┘ └────────────────┘ ││ │└─────────────────────────────────────────────────────────────────┘Success Metrics
| Metric | Current | Target | Improvement |
|---|---|---|---|
| Point lookup (PK) | 150ms | <1ms | 150x |
| Point lookup (UNIQUE) | 150ms | <1ms | 150x |
| Point lookup (non-indexed) | 150ms | <10ms | 15x |
| Range query with zone map | 150ms | <5ms | 30x |
| INSERT throughput | 1,280/s | 2,500,000/s | 1,953x |
| Index memory overhead | 0 | ~10 bytes/row | N/A |
Risk Mitigation
-
ART Implementation Complexity
- Mitigation: Use proven Rust libraries (
congeefor concurrent ART) - Fallback: Simple B-tree if ART proves too complex
- Mitigation: Use proven Rust libraries (
-
Index Maintenance Overhead
- Mitigation: Batch index updates during bulk load
- Use
bulk_load_modeto defer index updates
-
Memory Usage
- Mitigation: ART is space-efficient (8.1 bytes/key)
- Add memory budget configuration
-
Backward Compatibility
- Existing tables without indexes continue to work
- Auto-create indexes only on new tables (migration path for existing)