HeliosDB-Lite Index & SMFI Optimization Roadmap

Status: VALIDATED and APPROVED Date: 2026-01-28 Target: Sub-millisecond point lookups, 2.5M+ rows/sec ingestion with full ACID

Executive Summary

This roadmap addresses the fundamental performance gap in HeliosDB-Lite’s query execution:

Current: 150ms point lookups (full table scan every query)
Target: <1ms point lookups (index-based direct access)

Validated Requirements

1. Implement ART-based Indexes (Adaptive Radix Tree)

VALIDATED: Research confirms ART is superior to B-tree for HeliosDB’s use case.

Metric	B-tree	ART (Recommended)
Point Lookup	O(log N)	O(k) key length
Space Efficiency	20-40 bytes/key	8.1 bytes/key
Write Performance	Rebalancing overhead	No rebalancing
Cache Behavior	Moderate	Excellent
Range Scan	Good	Excellent

Why ART over B-tree:

Lock-free friendly (matches existing lock-free ingestion)
Adaptive node sizing (Node4/Node16/Node48/Node256)
Sorted order preserved for range queries
Better cache locality for modern CPUs
DuckDB, HyPer use ART successfully

Implementation:

pub struct ARTIndex<V> {
    root: AtomicPtr<ARTNode>,
    node_count: AtomicU64,
    memory_usage: AtomicU64,
}

// Node types for adaptive sizing
enum ARTNode {
    Node4 { keys: [u8; 4], children: [*mut ARTNode; 4], num_children: u8 },
    Node16 { keys: [u8; 16], children: [*mut ARTNode; 16], num_children: u8 },
    Node48 { keys: [u8; 256], children: [*mut ARTNode; 48], num_children: u8 },
    Node256 { children: [*mut ARTNode; 256] },
    Leaf { key: Vec<u8>, value: V },
}

2. Async SMFI Generation and Early Usage

VALIDATED: SMFI structures exist but are applied too late.

Current Flow (WRONG):

Query → scan_table_branch_aware() → LOAD ALL ROWS → scan_with_pushdown() → filter

Required Flow (CORRECT):

Query → Check Index → If no index: Check SMFI → Load only candidate rows → filter

Implementation Plan:

pub struct SMFIManager {
    // Bloom filters per table/column
    bloom_filters: DashMap<(String, String), BloomFilter>,

    // Zone maps per table/column (min/max per block)
    zone_maps: DashMap<(String, String), ZoneMap>,

    // Background builder
    builder_tx: mpsc::Sender<BuildTask>,

    // Read-during-rebuild support
    partial_filters: DashMap<String, PartialBloomFilter>,
}

impl SMFIManager {
    /// Check if value MIGHT exist (for point lookups on non-indexed columns)
    pub fn might_contain(&self, table: &str, column: &str, value: &Value) -> MightContain {
        // 1. Check if bloom filter exists
        if let Some(bf) = self.bloom_filters.get(&(table.into(), column.into())) {
            if !bf.might_contain(value) {
                return MightContain::DefinitelyNot;
            }
        }

        // 2. Check zone map for range info
        if let Some(zm) = self.zone_maps.get(&(table.into(), column.into())) {
            if !zm.value_in_range(value) {
                return MightContain::DefinitelyNot;
            }
            // Return candidate blocks
            return MightContain::Maybe(zm.get_candidate_blocks(value));
        }

        MightContain::Unknown
    }

    /// Build SMFI structures asynchronously
    pub async fn build_async(&self, table: &str, schema: &Schema) {
        self.builder_tx.send(BuildTask::Full { table: table.into(), schema: schema.clone() }).await;
    }

    /// Incremental update on INSERT
    pub fn on_insert(&self, table: &str, row_id: u64, tuple: &Tuple, schema: &Schema) {
        // Update bloom filters incrementally (O(1))
        for (idx, col) in schema.columns.iter().enumerate() {
            if let Some(bf) = self.bloom_filters.get_mut(&(table.into(), col.name.clone())) {
                bf.insert(&tuple.values[idx]);
            }
        }

        // Update zone maps incrementally
        // ...
    }
}

3. Auto-Create Indexes on PRIMARY KEY and UNIQUE

VALIDATED: This is essential for O(1) lookups.

Behavior:

PRIMARY KEY → Automatic ART index (enforces uniqueness)
UNIQUE → Automatic ART index (allows single NULL)
Index maintained on INSERT/UPDATE/DELETE

Implementation:

// In src/storage/catalog.rs - create_table()
pub fn create_table(&self, table_name: &str, schema: Schema) -> Result<()> {
    // ... existing code ...

    // AUTO-CREATE INDEXES
    for (idx, col) in schema.columns.iter().enumerate() {
        if col.primary_key {
            // Create PRIMARY KEY index
            let index_name = format!("{}_pk_idx", table_name);
            self.storage.create_art_index(
                &index_name,
                table_name,
                &[col.name.clone()],
                IndexType::PrimaryKey,
            )?;
        }

        if col.unique && !col.primary_key {
            // Create UNIQUE index (allows one NULL)
            let index_name = format!("{}_{}_unique_idx", table_name, col.name);
            self.storage.create_art_index(
                &index_name,
                table_name,
                &[col.name.clone()],
                IndexType::Unique { allow_null: true },
            )?;
        }
    }

    Ok(())
}

UNIQUE with NULL handling:

pub enum IndexType {
    PrimaryKey,           // No NULLs allowed
    Unique { allow_null: bool },  // One NULL allowed if true
    Secondary,            // Multiple duplicates allowed
}

impl ARTIndex {
    pub fn insert_unique(&self, key: &[u8], value: u64, allow_null: bool) -> Result<()> {
        // Handle NULL case
        if key == NULL_MARKER {
            if allow_null {
                // Check if NULL already exists
                if self.null_row_id.is_some() {
                    return Err(Error::UniqueViolation("NULL already exists"));
                }
                self.null_row_id = Some(value);
                return Ok(());
            }
            return Err(Error::NullNotAllowed);
        }

        // Normal unique insert
        // ...
    }
}

4. Push Predicates INTO Storage Iterator

VALIDATED: This is the critical architectural change.

Current (scan.rs:452):

// PROBLEM: All rows loaded BEFORE filtering
let base_tuples = storage.scan_table_branch_aware(&actual_table_name)?;
storage.predicate_pushdown().scan_with_pushdown(..., base_tuples, ...);

Required Change:

// SOLUTION: Pass predicates TO storage, filter DURING iteration
let tuples = storage.scan_table_with_predicates(
    &actual_table_name,
    &analyzed_predicates,  // Pass predicates to storage
    &schema,
)?;

New Storage Method:

pub fn scan_table_with_predicates(
    &self,
    table_name: &str,
    predicates: &[AnalyzedPredicate],
    schema: &Schema,
) -> Result<Vec<Tuple>> {
    // 1. Check for index candidates
    for pred in predicates {
        if pred.op == PredicateOp::Eq {
            // Check if column has an index
            if let Some(index) = self.get_index_for_column(table_name, &pred.column_name) {
                // DIRECT INDEX LOOKUP - O(1)
                if let Some(row_id) = index.get(&pred.value)? {
                    return self.get_tuple_by_row_id(table_name, row_id);
                }
                return Ok(vec![]); // Not found
            }
        }
    }

    // 2. No index hit - check SMFI
    let smfi = self.smfi_manager();
    let mut candidate_row_ids: Option<Vec<u64>> = None;

    for pred in predicates {
        if pred.op == PredicateOp::Eq {
            match smfi.might_contain(table_name, &pred.column_name, &pred.value) {
                MightContain::DefinitelyNot => return Ok(vec![]), // Early exit!
                MightContain::Maybe(blocks) => {
                    // Only scan these blocks
                    candidate_row_ids = Some(self.get_row_ids_in_blocks(table_name, &blocks)?);
                }
                MightContain::Unknown => {} // Fall through to full scan
            }
        }
    }

    // 3. Scan only candidate rows (or full scan if no SMFI info)
    if let Some(row_ids) = candidate_row_ids {
        // Scan only specific rows
        self.scan_rows_by_ids(table_name, &row_ids, predicates, schema)
    } else {
        // Full scan with predicate filtering during iteration
        self.scan_table_with_filter(table_name, predicates, schema)
    }
}

fn scan_table_with_filter(
    &self,
    table_name: &str,
    predicates: &[AnalyzedPredicate],
    schema: &Schema,
) -> Result<Vec<Tuple>> {
    let mut results = Vec::new();
    let prefix = format!("data:{}:", table_name);

    // Iterate with early filtering
    for item in self.db.prefix_iterator(prefix.as_bytes()) {
        let (key, value) = item?;

        // Deserialize
        let tuple: Tuple = bincode::deserialize(&value)?;

        // FILTER DURING ITERATION (not after)
        if self.tuple_matches_predicates(&tuple, predicates, schema) {
            results.push(tuple);

            // Early termination for LIMIT (if supported)
            // ...
        }
    }

    Ok(results)
}

5. SMFI as Fallback for Non-Indexed Columns

VALIDATED: This is the desired behavior.

Query Execution Priority:

1. Check for explicit index on predicate column(s)
   └─ YES → Use index for O(1) or O(log k) lookup
   └─ NO  → Continue to step 2

2. Check SMFI bloom filter
   └─ DefinitelyNot → Return empty result immediately
   └─ Maybe → Get candidate blocks, scan only those
   └─ Unknown → Continue to step 3

3. Check SMFI zone map
   └─ Value outside all ranges → Return empty result
   └─ Value in some blocks → Scan only those blocks
   └─ No zone map → Continue to step 4

4. Full scan with predicate filtering during iteration

Example Query Flow:

SELECT * FROM orders WHERE customer_id = 12345;

1. Check index on `customer_id`
   - No explicit index created by user
   - Not PRIMARY KEY or UNIQUE
   → No index available

2. Check bloom filter for `orders.customer_id`
   - Bloom filter says: might_contain(12345) = true
   → Cannot skip

3. Check zone map for `orders.customer_id`
   - Block 0: min=1, max=5000 → SKIP
   - Block 1: min=5001, max=15000 → CANDIDATE (contains 12345)
   - Block 2: min=15001, max=25000 → SKIP
   → Scan only Block 1

4. Scan Block 1 rows with filter
   - Only ~1000 rows instead of 3000
   - Apply predicate during iteration
   → Return matching rows

Implementation Roadmap

Phase 1: ART Index Infrastructure (Week 1-2)

Files to Create:

src/storage/art_index.rs - Core ART implementation
src/storage/art_node.rs - Node types (Node4/16/48/256)
src/storage/index_manager.rs - Index lifecycle management

Files to Modify:

src/storage/mod.rs - Export new modules
src/storage/catalog.rs - Auto-create indexes on PK/UNIQUE
src/storage/engine.rs - Index management methods

Key APIs:

// Index creation
pub fn create_art_index(&self, name: &str, table: &str, columns: &[String], index_type: IndexType) -> Result<()>;

// Index lookup
pub fn index_lookup(&self, table: &str, column: &str, value: &Value) -> Result<Option<u64>>;

// Index maintenance
pub fn index_insert(&self, table: &str, column: &str, value: &Value, row_id: u64) -> Result<()>;
pub fn index_delete(&self, table: &str, column: &str, value: &Value) -> Result<()>;

Phase 2: Async SMFI Manager (Week 2-3)

Files to Create:

src/storage/smfi_manager.rs - Unified SMFI management
src/storage/smfi_builder.rs - Background builder worker

Files to Modify:

src/storage/predicate_pushdown.rs - Integrate with SMFI manager
src/storage/engine.rs - SMFI hooks on INSERT/UPDATE/DELETE

Key APIs:

// SMFI queries
pub fn might_contain(&self, table: &str, column: &str, value: &Value) -> MightContain;
pub fn get_candidate_blocks(&self, table: &str, column: &str, value: &Value) -> Vec<BlockId>;

// Async building
pub async fn build_smfi_async(&self, table: &str);
pub fn trigger_rebuild(&self, table: &str);

// Incremental updates
pub fn on_insert(&self, table: &str, row_id: u64, tuple: &Tuple, schema: &Schema);
pub fn on_delete(&self, table: &str, row_id: u64, tuple: &Tuple, schema: &Schema);

Phase 3: Predicate Pushdown Rewrite (Week 3-4)

Files to Modify:

src/sql/executor/scan.rs - Pass predicates to storage
src/storage/engine.rs - New scan_table_with_predicates() method
src/optimizer/rules.rs - Optimize predicate ordering

Key Changes:

// BEFORE (scan.rs:452)
let base_tuples = storage.scan_table_branch_aware(&actual_table_name)?;
storage.predicate_pushdown().scan_with_pushdown(...);

// AFTER
let tuples = storage.scan_table_with_predicates(
    &actual_table_name,
    &analyzed_predicates,
    &schema,
)?;

Phase 4: Integration & Testing (Week 4-5)

Test Cases:

Point lookup on PRIMARY KEY → <1ms
Point lookup on UNIQUE → <1ms
Point lookup on non-indexed with SMFI → <10ms (block skip)
Range query with zone map → Block skipping verified
INSERT maintains all indexes → No performance regression
Concurrent access → Lock-free correctness

Benchmarks:

# Point lookup benchmark
cargo bench --bench index_lookup -- point_lookup

# Expected results:
# - 100K rows: <1ms (was 150ms)
# - 1M rows: <1ms (was 1500ms)
# - 10M rows: <2ms (would be 15000ms)

Architecture Diagram

┌─────────────────────────────────────────────────────────────────┐
│                         Query Executor                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────────────┐   │
│  │   Parser    │ → │   Planner   │ → │ Predicate Analyzer  │   │
│  └─────────────┘   └─────────────┘   └─────────────────────┘   │
│                                              │                  │
│                                              ▼                  │
│                    ┌─────────────────────────────────────────┐  │
│                    │         Index/SMFI Lookup              │  │
│                    │                                        │  │
│                    │  1. Check ART Index (PK/UNIQUE)        │  │
│                    │     └─ O(k) direct lookup              │  │
│                    │                                        │  │
│                    │  2. Check SMFI Bloom Filter            │  │
│                    │     └─ Definitely not? → Return empty  │  │
│                    │                                        │  │
│                    │  3. Check SMFI Zone Map                │  │
│                    │     └─ Get candidate blocks            │  │
│                    │                                        │  │
│                    │  4. Filtered Scan (last resort)        │  │
│                    │     └─ Filter during iteration         │  │
│                    └─────────────────────────────────────────┘  │
│                                              │                  │
└──────────────────────────────────────────────│──────────────────┘
                                               │
┌──────────────────────────────────────────────▼──────────────────┐
│                       Storage Engine                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────┐  ┌─────────────────┐  ┌────────────────┐  │
│  │   ART Indexes   │  │  SMFI Manager   │  │   RocksDB      │  │
│  │                 │  │                 │  │  (LSM-tree)    │  │
│  │ - PK Index      │  │ - Bloom Filters │  │                │  │
│  │ - UNIQUE Index  │  │ - Zone Maps     │  │ - Persistence  │  │
│  │ - Secondary     │  │ - Async Build   │  │ - Compression  │  │
│  └─────────────────┘  └─────────────────┘  └────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Success Metrics

Metric	Current	Target	Improvement
Point lookup (PK)	150ms	<1ms	150x
Point lookup (UNIQUE)	150ms	<1ms	150x
Point lookup (non-indexed)	150ms	<10ms	15x
Range query with zone map	150ms	<5ms	30x
INSERT throughput	1,280/s	2,500,000/s	1,953x
Index memory overhead	0	~10 bytes/row	N/A

Risk Mitigation

ART Implementation Complexity
- Mitigation: Use proven Rust libraries (congee for concurrent ART)
- Fallback: Simple B-tree if ART proves too complex
Index Maintenance Overhead
- Mitigation: Batch index updates during bulk load
- Use bulk_load_mode to defer index updates
Memory Usage
- Mitigation: ART is space-efficient (8.1 bytes/key)
- Add memory budget configuration
Backward Compatibility
- Existing tables without indexes continue to work
- Auto-create indexes only on new tables (migration path for existing)

HeliosDB-Lite Index & SMFI Optimization Roadmap

HeliosDB-Lite Index & SMFI Optimization Roadmap

Executive Summary

Validated Requirements

1. Implement ART-based Indexes (Adaptive Radix Tree)

2. Async SMFI Generation and Early Usage

3. Auto-Create Indexes on PRIMARY KEY and UNIQUE

4. Push Predicates INTO Storage Iterator

5. SMFI as Fallback for Non-Indexed Columns

Implementation Roadmap

Phase 1: ART Index Infrastructure (Week 1-2)

Phase 2: Async SMFI Manager (Week 2-3)

Phase 3: Predicate Pushdown Rewrite (Week 3-4)

Phase 4: Integration & Testing (Week 4-5)

Architecture Diagram

Success Metrics

Risk Mitigation

Appendix: Research Sources