Skip to content

HeliosDB-Lite Index & SMFI Optimization Roadmap

HeliosDB-Lite Index & SMFI Optimization Roadmap

Status: VALIDATED and APPROVED Date: 2026-01-28 Target: Sub-millisecond point lookups, 2.5M+ rows/sec ingestion with full ACID


Executive Summary

This roadmap addresses the fundamental performance gap in HeliosDB-Lite’s query execution:

  • Current: 150ms point lookups (full table scan every query)
  • Target: <1ms point lookups (index-based direct access)

Validated Requirements

1. Implement ART-based Indexes (Adaptive Radix Tree)

VALIDATED: Research confirms ART is superior to B-tree for HeliosDB’s use case.

MetricB-treeART (Recommended)
Point LookupO(log N)O(k) key length
Space Efficiency20-40 bytes/key8.1 bytes/key
Write PerformanceRebalancing overheadNo rebalancing
Cache BehaviorModerateExcellent
Range ScanGoodExcellent

Why ART over B-tree:

  • Lock-free friendly (matches existing lock-free ingestion)
  • Adaptive node sizing (Node4/Node16/Node48/Node256)
  • Sorted order preserved for range queries
  • Better cache locality for modern CPUs
  • DuckDB, HyPer use ART successfully

Implementation:

src/storage/art_index.rs
pub struct ARTIndex<V> {
root: AtomicPtr<ARTNode>,
node_count: AtomicU64,
memory_usage: AtomicU64,
}
// Node types for adaptive sizing
enum ARTNode {
Node4 { keys: [u8; 4], children: [*mut ARTNode; 4], num_children: u8 },
Node16 { keys: [u8; 16], children: [*mut ARTNode; 16], num_children: u8 },
Node48 { keys: [u8; 256], children: [*mut ARTNode; 48], num_children: u8 },
Node256 { children: [*mut ARTNode; 256] },
Leaf { key: Vec<u8>, value: V },
}

2. Async SMFI Generation and Early Usage

VALIDATED: SMFI structures exist but are applied too late.

Current Flow (WRONG):

Query → scan_table_branch_aware() → LOAD ALL ROWS → scan_with_pushdown() → filter

Required Flow (CORRECT):

Query → Check Index → If no index: Check SMFI → Load only candidate rows → filter

Implementation Plan:

src/storage/smfi_manager.rs
pub struct SMFIManager {
// Bloom filters per table/column
bloom_filters: DashMap<(String, String), BloomFilter>,
// Zone maps per table/column (min/max per block)
zone_maps: DashMap<(String, String), ZoneMap>,
// Background builder
builder_tx: mpsc::Sender<BuildTask>,
// Read-during-rebuild support
partial_filters: DashMap<String, PartialBloomFilter>,
}
impl SMFIManager {
/// Check if value MIGHT exist (for point lookups on non-indexed columns)
pub fn might_contain(&self, table: &str, column: &str, value: &Value) -> MightContain {
// 1. Check if bloom filter exists
if let Some(bf) = self.bloom_filters.get(&(table.into(), column.into())) {
if !bf.might_contain(value) {
return MightContain::DefinitelyNot;
}
}
// 2. Check zone map for range info
if let Some(zm) = self.zone_maps.get(&(table.into(), column.into())) {
if !zm.value_in_range(value) {
return MightContain::DefinitelyNot;
}
// Return candidate blocks
return MightContain::Maybe(zm.get_candidate_blocks(value));
}
MightContain::Unknown
}
/// Build SMFI structures asynchronously
pub async fn build_async(&self, table: &str, schema: &Schema) {
self.builder_tx.send(BuildTask::Full { table: table.into(), schema: schema.clone() }).await;
}
/// Incremental update on INSERT
pub fn on_insert(&self, table: &str, row_id: u64, tuple: &Tuple, schema: &Schema) {
// Update bloom filters incrementally (O(1))
for (idx, col) in schema.columns.iter().enumerate() {
if let Some(bf) = self.bloom_filters.get_mut(&(table.into(), col.name.clone())) {
bf.insert(&tuple.values[idx]);
}
}
// Update zone maps incrementally
// ...
}
}

3. Auto-Create Indexes on PRIMARY KEY and UNIQUE

VALIDATED: This is essential for O(1) lookups.

Behavior:

  • PRIMARY KEY → Automatic ART index (enforces uniqueness)
  • UNIQUE → Automatic ART index (allows single NULL)
  • Index maintained on INSERT/UPDATE/DELETE

Implementation:

// In src/storage/catalog.rs - create_table()
pub fn create_table(&self, table_name: &str, schema: Schema) -> Result<()> {
// ... existing code ...
// AUTO-CREATE INDEXES
for (idx, col) in schema.columns.iter().enumerate() {
if col.primary_key {
// Create PRIMARY KEY index
let index_name = format!("{}_pk_idx", table_name);
self.storage.create_art_index(
&index_name,
table_name,
&[col.name.clone()],
IndexType::PrimaryKey,
)?;
}
if col.unique && !col.primary_key {
// Create UNIQUE index (allows one NULL)
let index_name = format!("{}_{}_unique_idx", table_name, col.name);
self.storage.create_art_index(
&index_name,
table_name,
&[col.name.clone()],
IndexType::Unique { allow_null: true },
)?;
}
}
Ok(())
}

UNIQUE with NULL handling:

pub enum IndexType {
PrimaryKey, // No NULLs allowed
Unique { allow_null: bool }, // One NULL allowed if true
Secondary, // Multiple duplicates allowed
}
impl ARTIndex {
pub fn insert_unique(&self, key: &[u8], value: u64, allow_null: bool) -> Result<()> {
// Handle NULL case
if key == NULL_MARKER {
if allow_null {
// Check if NULL already exists
if self.null_row_id.is_some() {
return Err(Error::UniqueViolation("NULL already exists"));
}
self.null_row_id = Some(value);
return Ok(());
}
return Err(Error::NullNotAllowed);
}
// Normal unique insert
// ...
}
}

4. Push Predicates INTO Storage Iterator

VALIDATED: This is the critical architectural change.

Current (scan.rs:452):

// PROBLEM: All rows loaded BEFORE filtering
let base_tuples = storage.scan_table_branch_aware(&actual_table_name)?;
storage.predicate_pushdown().scan_with_pushdown(..., base_tuples, ...);

Required Change:

// SOLUTION: Pass predicates TO storage, filter DURING iteration
let tuples = storage.scan_table_with_predicates(
&actual_table_name,
&analyzed_predicates, // Pass predicates to storage
&schema,
)?;

New Storage Method:

src/storage/engine.rs
pub fn scan_table_with_predicates(
&self,
table_name: &str,
predicates: &[AnalyzedPredicate],
schema: &Schema,
) -> Result<Vec<Tuple>> {
// 1. Check for index candidates
for pred in predicates {
if pred.op == PredicateOp::Eq {
// Check if column has an index
if let Some(index) = self.get_index_for_column(table_name, &pred.column_name) {
// DIRECT INDEX LOOKUP - O(1)
if let Some(row_id) = index.get(&pred.value)? {
return self.get_tuple_by_row_id(table_name, row_id);
}
return Ok(vec![]); // Not found
}
}
}
// 2. No index hit - check SMFI
let smfi = self.smfi_manager();
let mut candidate_row_ids: Option<Vec<u64>> = None;
for pred in predicates {
if pred.op == PredicateOp::Eq {
match smfi.might_contain(table_name, &pred.column_name, &pred.value) {
MightContain::DefinitelyNot => return Ok(vec![]), // Early exit!
MightContain::Maybe(blocks) => {
// Only scan these blocks
candidate_row_ids = Some(self.get_row_ids_in_blocks(table_name, &blocks)?);
}
MightContain::Unknown => {} // Fall through to full scan
}
}
}
// 3. Scan only candidate rows (or full scan if no SMFI info)
if let Some(row_ids) = candidate_row_ids {
// Scan only specific rows
self.scan_rows_by_ids(table_name, &row_ids, predicates, schema)
} else {
// Full scan with predicate filtering during iteration
self.scan_table_with_filter(table_name, predicates, schema)
}
}
fn scan_table_with_filter(
&self,
table_name: &str,
predicates: &[AnalyzedPredicate],
schema: &Schema,
) -> Result<Vec<Tuple>> {
let mut results = Vec::new();
let prefix = format!("data:{}:", table_name);
// Iterate with early filtering
for item in self.db.prefix_iterator(prefix.as_bytes()) {
let (key, value) = item?;
// Deserialize
let tuple: Tuple = bincode::deserialize(&value)?;
// FILTER DURING ITERATION (not after)
if self.tuple_matches_predicates(&tuple, predicates, schema) {
results.push(tuple);
// Early termination for LIMIT (if supported)
// ...
}
}
Ok(results)
}

5. SMFI as Fallback for Non-Indexed Columns

VALIDATED: This is the desired behavior.

Query Execution Priority:

1. Check for explicit index on predicate column(s)
└─ YES → Use index for O(1) or O(log k) lookup
└─ NO → Continue to step 2
2. Check SMFI bloom filter
└─ DefinitelyNot → Return empty result immediately
└─ Maybe → Get candidate blocks, scan only those
└─ Unknown → Continue to step 3
3. Check SMFI zone map
└─ Value outside all ranges → Return empty result
└─ Value in some blocks → Scan only those blocks
└─ No zone map → Continue to step 4
4. Full scan with predicate filtering during iteration

Example Query Flow:

SELECT * FROM orders WHERE customer_id = 12345;
1. Check index on `customer_id`
- No explicit index created by user
- Not PRIMARY KEY or UNIQUE
→ No index available
2. Check bloom filter for `orders.customer_id`
- Bloom filter says: might_contain(12345) = true
→ Cannot skip
3. Check zone map for `orders.customer_id`
- Block 0: min=1, max=5000 → SKIP
- Block 1: min=5001, max=15000 → CANDIDATE (contains 12345)
- Block 2: min=15001, max=25000 → SKIP
→ Scan only Block 1
4. Scan Block 1 rows with filter
- Only ~1000 rows instead of 3000
- Apply predicate during iteration
→ Return matching rows

Implementation Roadmap

Phase 1: ART Index Infrastructure (Week 1-2)

Files to Create:

  • src/storage/art_index.rs - Core ART implementation
  • src/storage/art_node.rs - Node types (Node4/16/48/256)
  • src/storage/index_manager.rs - Index lifecycle management

Files to Modify:

  • src/storage/mod.rs - Export new modules
  • src/storage/catalog.rs - Auto-create indexes on PK/UNIQUE
  • src/storage/engine.rs - Index management methods

Key APIs:

// Index creation
pub fn create_art_index(&self, name: &str, table: &str, columns: &[String], index_type: IndexType) -> Result<()>;
// Index lookup
pub fn index_lookup(&self, table: &str, column: &str, value: &Value) -> Result<Option<u64>>;
// Index maintenance
pub fn index_insert(&self, table: &str, column: &str, value: &Value, row_id: u64) -> Result<()>;
pub fn index_delete(&self, table: &str, column: &str, value: &Value) -> Result<()>;

Phase 2: Async SMFI Manager (Week 2-3)

Files to Create:

  • src/storage/smfi_manager.rs - Unified SMFI management
  • src/storage/smfi_builder.rs - Background builder worker

Files to Modify:

  • src/storage/predicate_pushdown.rs - Integrate with SMFI manager
  • src/storage/engine.rs - SMFI hooks on INSERT/UPDATE/DELETE

Key APIs:

// SMFI queries
pub fn might_contain(&self, table: &str, column: &str, value: &Value) -> MightContain;
pub fn get_candidate_blocks(&self, table: &str, column: &str, value: &Value) -> Vec<BlockId>;
// Async building
pub async fn build_smfi_async(&self, table: &str);
pub fn trigger_rebuild(&self, table: &str);
// Incremental updates
pub fn on_insert(&self, table: &str, row_id: u64, tuple: &Tuple, schema: &Schema);
pub fn on_delete(&self, table: &str, row_id: u64, tuple: &Tuple, schema: &Schema);

Phase 3: Predicate Pushdown Rewrite (Week 3-4)

Files to Modify:

  • src/sql/executor/scan.rs - Pass predicates to storage
  • src/storage/engine.rs - New scan_table_with_predicates() method
  • src/optimizer/rules.rs - Optimize predicate ordering

Key Changes:

// BEFORE (scan.rs:452)
let base_tuples = storage.scan_table_branch_aware(&actual_table_name)?;
storage.predicate_pushdown().scan_with_pushdown(...);
// AFTER
let tuples = storage.scan_table_with_predicates(
&actual_table_name,
&analyzed_predicates,
&schema,
)?;

Phase 4: Integration & Testing (Week 4-5)

Test Cases:

  1. Point lookup on PRIMARY KEY → <1ms
  2. Point lookup on UNIQUE → <1ms
  3. Point lookup on non-indexed with SMFI → <10ms (block skip)
  4. Range query with zone map → Block skipping verified
  5. INSERT maintains all indexes → No performance regression
  6. Concurrent access → Lock-free correctness

Benchmarks:

Terminal window
# Point lookup benchmark
cargo bench --bench index_lookup -- point_lookup
# Expected results:
# - 100K rows: <1ms (was 150ms)
# - 1M rows: <1ms (was 1500ms)
# - 10M rows: <2ms (would be 15000ms)

Architecture Diagram

┌─────────────────────────────────────────────────────────────────┐
│ Query Executor │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ Parser │ → │ Planner │ → │ Predicate Analyzer │ │
│ └─────────────┘ └─────────────┘ └─────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ Index/SMFI Lookup │ │
│ │ │ │
│ │ 1. Check ART Index (PK/UNIQUE) │ │
│ │ └─ O(k) direct lookup │ │
│ │ │ │
│ │ 2. Check SMFI Bloom Filter │ │
│ │ └─ Definitely not? → Return empty │ │
│ │ │ │
│ │ 3. Check SMFI Zone Map │ │
│ │ └─ Get candidate blocks │ │
│ │ │ │
│ │ 4. Filtered Scan (last resort) │ │
│ │ └─ Filter during iteration │ │
│ └─────────────────────────────────────────┘ │
│ │ │
└──────────────────────────────────────────────│──────────────────┘
┌──────────────────────────────────────────────▼──────────────────┐
│ Storage Engine │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌────────────────┐ │
│ │ ART Indexes │ │ SMFI Manager │ │ RocksDB │ │
│ │ │ │ │ │ (LSM-tree) │ │
│ │ - PK Index │ │ - Bloom Filters │ │ │ │
│ │ - UNIQUE Index │ │ - Zone Maps │ │ - Persistence │ │
│ │ - Secondary │ │ - Async Build │ │ - Compression │ │
│ └─────────────────┘ └─────────────────┘ └────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

Success Metrics

MetricCurrentTargetImprovement
Point lookup (PK)150ms<1ms150x
Point lookup (UNIQUE)150ms<1ms150x
Point lookup (non-indexed)150ms<10ms15x
Range query with zone map150ms<5ms30x
INSERT throughput1,280/s2,500,000/s1,953x
Index memory overhead0~10 bytes/rowN/A

Risk Mitigation

  1. ART Implementation Complexity

    • Mitigation: Use proven Rust libraries (congee for concurrent ART)
    • Fallback: Simple B-tree if ART proves too complex
  2. Index Maintenance Overhead

    • Mitigation: Batch index updates during bulk load
    • Use bulk_load_mode to defer index updates
  3. Memory Usage

    • Mitigation: ART is space-efficient (8.1 bytes/key)
    • Add memory budget configuration
  4. Backward Compatibility

    • Existing tables without indexes continue to work
    • Auto-create indexes only on new tables (migration path for existing)

Appendix: Research Sources