Storage-Level Self-Maintaining Filter Architecture
Executive Summary
This document analyzes how to enhance HeliosDB Nano’s storage-level filtering to achieve:
- Zero memory overhead for concurrent connections
- Self-maintaining filter structures updated during DML operations
- Parallel-ready architecture for optimizer-driven parallelism
- CPU-aware background maintenance inspired by MV auto-refresh
Current State Analysis
Existing Components
| Component | Location | Status | Self-Maintaining? |
|---|---|---|---|
| Bloom Filters | bloom_filter.rs | ✓ Implemented | ✗ Manual rebuild |
| Zone Maps | zone_map.rs | ✓ Implemented | ✗ Manual rebuild |
| SIMD Filtering | simd_filter.rs | ✓ Implemented | N/A (stateless) |
| Delta Tracker | mv_delta.rs | ✓ Implemented | ✓ Auto on DML |
| CPU Monitor | mv_scheduler.rs | ✓ Implemented | N/A |
Gap: Filter structures require explicit rebuilding
Current flow:
```
INSERT → Delta recorded (for MV) → Bloom/Zone NOT updated
              ↓
Requires manual:
  engine.build_bloom_filters_for_table("t")
  engine.build_zone_maps_for_table("t", 1000)
```
Comparative Analysis: Database Filtering Architectures
Oracle Exadata Storage Indexes
Architecture:
```
Exadata Storage Cell
├─ Storage Index (per 1MB region, auto-maintained)
│    Col1: min/max │ Col2: min/max │ Col3: min/max │ ...
├─ Maintenance: Updated on EACH write (synchronous)
├─ Storage: ~1% of data size
└─ Memory: None (disk-resident, cached in storage cell RAM)
```
Key Innovation:
- Automatic maintenance with NO DBA intervention
- Per-region granularity (1MB default)
- Synchronous update on write (no background job needed)
PostgreSQL BRIN Indexes
Architecture:
```
BRIN Index Structure
├─ Block Range Index (per N pages, default 128)
│    Range ID │ Min Value │ Max Value │ Has Null
├─ Maintenance: Lazy - requires VACUUM or explicit refresh
├─ Storage: Very small (~0.003% of table size)
└─ Memory: Index pages loaded on demand
```
Limitation:
- NOT self-maintaining (needs VACUUM)
- Best for naturally-ordered data (timestamps, sequences)
- No bloom filter capability
ClickHouse Data Skipping Indexes
Architecture:
```
ClickHouse Granule System
├─ Per-granule (8192 rows default):
│    minmax │ set(N) │ bloom_filter │ ngrambf indexes
├─ Maintenance: Built during MERGE (background)
├─ Storage: Embedded in data parts
└─ Memory: Loaded lazily during query
```
Key Innovation:
- Multiple index types per column
- Background maintenance during compaction
- No memory overhead for idle tables
ScyllaDB/Cassandra Bloom Filters
Architecture:
```
Per-SSTable Bloom Filter
├─ Built during compaction/flush, stored with SSTable
│    Partition Key Bloom Filter (per SSTable)
│    Updated: On SSTable creation (immutable)
│    Size: ~10 bits per key
├─ Memory: Optionally memory-mapped
└─ Maintenance: Zero (immutable once created)
```
Proposed Architecture: Self-Maintaining Filter Index (SMFI)
Design Principles
- Storage-First: All filter structures persisted to RocksDB, not memory
- Delta-Driven: Leverage the existing mv_delta_tracker for incremental updates
- CPU-Aware: Background consolidation with CPU throttling (like MV refresh)
- Parallel-Safe: Lock-free reads, append-only writes
Architecture Overview
```
STORAGE-LEVEL FILTERING ARCHITECTURE

DML OPERATIONS
  INSERT ──┬
  UPDATE ──┼─► mv_delta_tracker.record_*()
  DELETE ──┘            │
                        ▼
FILTER INDEX DELTA TRACKER (NEW)
  Synchronous (Lightweight):
    • Bloom filter bit-set (OR operation, ~100ns)
    • Zone map min/max update (comparison, ~50ns)
    • Delta log append (for background consolidation)
                        │
                        ▼
STORAGE-PERSISTED FILTER STRUCTURES
  RocksDB keys:
    fidx:bloom:{table}:{column}   → BloomFilter bytes
    fidx:zone:{table}:{block_id}  → BlockZoneMap bytes
    fidx:delta:{table}:{seq}      → FilterDelta bytes
    fidx:meta:{table}             → FilterMetadata bytes
  Memory Usage: ZERO (all disk-resident, loaded on-demand)
                        │
                        ▼
BACKGROUND CONSOLIDATION WORKER (inspired by AutoRefreshWorker)
  Triggers:
    • Delta count threshold (e.g., 1000 deltas)
    • Time-based (e.g., every 5 minutes)
    • CPU availability (< 50% usage)
  Operations:
    • Merge delta bloom filters into base filter
    • Recalculate zone maps from delta log
    • Compact delta log
    • Update filter metadata
```
Detailed Component Design
1. Filter Index Delta Tracker
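The tracker below stores a config: FilterIndexConfig and later reads num_hashes and bloom_bits from it, but the type itself is not shown anywhere in this document. A minimal sketch of what it might contain; only the first two fields are actually referenced, the rest are illustrative assumptions.

```rust
/// Sketch only: configuration assumed by FilterIndexDeltaTracker below.
/// num_hashes and bloom_bits are the fields the original code reads;
/// the remaining fields are illustrative assumptions.
#[derive(Clone)]
pub struct FilterIndexConfig {
    /// Number of hash functions applied per bloom probe
    pub num_hashes: u32,
    /// Total bloom filter size in bits
    pub bloom_bits: u64,
    /// Rows per zone-map block
    pub zone_block_size: u64,
    /// Flush in-memory deltas to RocksDB after this many buffered rows
    pub delta_flush_threshold: u64,
}
```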
```rust
/// Lightweight synchronous filter index updates during DML
pub struct FilterIndexDeltaTracker {
    /// Storage engine reference for persistence
    db: Arc<DB>,
    /// Per-table bloom filter deltas (in-memory buffer, flushed periodically)
    bloom_deltas: DashMap<String, BloomFilterDelta>,
    /// Per-table zone map deltas
    zone_deltas: DashMap<String, ZoneMapDelta>,
    /// Delta sequence counter
    delta_seq: AtomicU64,
    /// Configuration
    config: FilterIndexConfig,
}

/// Minimal bloom filter delta (append-only)
/// Default is needed for DashMap::entry(...).or_default() in update_bloom_delta below
#[derive(Default)]
pub struct BloomFilterDelta {
    /// Bits to OR into base filter
    pub bits_to_set: Vec<(usize, u64)>, // (word_index, bits_mask)
    /// Row count added
    pub rows_added: u64,
}

/// Zone map delta for incremental updates
pub struct ZoneMapDelta {
    /// Column updates: (column_name, new_min, new_max)
    pub column_updates: Vec<(String, Option<Value>, Option<Value>)>,
    /// Block ID this delta applies to
    pub block_id: u64,
}
```
```rust
impl FilterIndexDeltaTracker {
    /// Called synchronously during INSERT (cost: ~150ns per row)
    pub fn on_insert(&self, table: &str, row_id: u64, tuple: &Tuple, schema: &Schema) {
        // 1. Update bloom filter delta (fast bit operations)
        self.update_bloom_delta(table, tuple, schema);

        // 2. Update zone map delta (fast min/max comparison)
        self.update_zone_delta(table, row_id, tuple, schema);

        // 3. Append to delta log (for crash recovery)
        self.append_delta_log(table, DeltaType::Insert, row_id, tuple);
    }

    /// Bloom update is O(1) - just compute hash and set bits
    fn update_bloom_delta(&self, table: &str, tuple: &Tuple, schema: &Schema) {
        let mut delta = self.bloom_deltas.entry(table.to_string()).or_default();
        for value in tuple.values.iter() {
            if !matches!(value, Value::Null) {
                let (h1, h2) = self.hash_value(value);
                // Compute bit positions using Kirsch-Mitzenmacher
                for k in 0..self.config.num_hashes {
                    let bit_pos = (h1.wrapping_add(k as u64 * h2)) % self.config.bloom_bits;
                    let word_idx = bit_pos / 64;
                    let bit_mask = 1u64 << (bit_pos % 64);
                    delta.bits_to_set.push((word_idx as usize, bit_mask));
                }
            }
        }
        delta.rows_added += 1;
    }
}
```
2. Storage-Persisted Filter Structures
```rust
/// Filter metadata stored in RocksDB
#[derive(Serialize, Deserialize)]
pub struct FilterIndexMetadata {
    /// Table name
    pub table_name: String,
    /// Number of rows indexed
    pub indexed_rows: u64,
    /// Number of pending deltas
    pub pending_deltas: u64,
    /// Last consolidation timestamp
    pub last_consolidation: DateTime<Utc>,
    /// Bloom filter config
    pub bloom_config: BloomFilterConfig,
    /// Zone map block size
    pub zone_block_size: u64,
    /// Filter index version (for upgrades)
    pub version: u32,
}
```
```rust
/// RocksDB key prefixes for filter index storage
pub mod filter_index_keys {
    pub const BLOOM_PREFIX: &str = "fidx:bloom:";
    pub const ZONE_PREFIX: &str = "fidx:zone:";
    pub const DELTA_PREFIX: &str = "fidx:delta:";
    pub const META_PREFIX: &str = "fidx:meta:";
}
```
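To make the storage-first principle concrete, here is a hedged sketch of writing and reading a consolidated bloom filter under the fidx:bloom: key scheme above. The rocksdb put/get calls are standard; the bincode serialization and a serde-serializable BloomFilter type are assumptions, not something this design specifies.

```rust
use rocksdb::DB;

/// Sketch: persist a consolidated bloom filter for (table, column).
/// Assumes BloomFilter implements serde Serialize/Deserialize (assumption).
fn persist_bloom_filter(
    db: &DB,
    table: &str,
    column: &str,
    filter: &BloomFilter,
) -> Result<(), Box<dyn std::error::Error>> {
    let key = format!("{}{}:{}", filter_index_keys::BLOOM_PREFIX, table, column);
    let bytes = bincode::serialize(filter)?; // assumed serialization format
    db.put(key.as_bytes(), &bytes)?;         // disk-resident; no per-connection memory
    Ok(())
}

/// Sketch: load the filter on demand during query planning.
fn load_bloom_filter(
    db: &DB,
    table: &str,
    column: &str,
) -> Result<Option<BloomFilter>, Box<dyn std::error::Error>> {
    let key = format!("{}{}:{}", filter_index_keys::BLOOM_PREFIX, table, column);
    Ok(match db.get(key.as_bytes())? {
        Some(bytes) => Some(bincode::deserialize(&bytes)?),
        None => None,
    })
}
```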
3. CPU-Aware Background Consolidation Worker
```rust
/// Background worker for filter index maintenance
/// Inspired by AutoRefreshWorker pattern
pub struct FilterIndexMaintenanceWorker {
    /// Storage engine
    storage: Arc<StorageEngine>,
    /// Configuration
    config: MaintenanceConfig,
    /// CPU monitor (reuse from mv_scheduler)
    cpu_monitor: Arc<CpuMonitor>,
    /// Shutdown signal
    shutdown: Arc<AtomicBool>,
}

#[derive(Clone)]
pub struct MaintenanceConfig {
    /// Maximum CPU usage before throttling (default: 15%)
    pub max_cpu_percent: f32,
    /// Delta threshold before consolidation (default: 1000)
    pub delta_threshold: u64,
    /// Time threshold before consolidation (default: 300 seconds)
    pub time_threshold_seconds: u64,
    /// Check interval (default: 10 seconds)
    pub check_interval_seconds: u64,
    /// Enable parallel consolidation
    pub parallel_consolidation: bool,
}
```
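The defaults called out in the field comments can be captured directly; a small sketch (the numeric values come from the comments above, the choice to enable parallel consolidation by default is an assumption):

```rust
impl Default for MaintenanceConfig {
    fn default() -> Self {
        Self {
            max_cpu_percent: 15.0,        // throttle when CPU is above 15%
            delta_threshold: 1000,        // consolidate after 1000 pending deltas
            time_threshold_seconds: 300,  // or after 5 minutes
            check_interval_seconds: 10,   // poll every 10 seconds
            parallel_consolidation: true, // assumption: parallel path on by default
        }
    }
}
```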
```rust
impl FilterIndexMaintenanceWorker {
    pub async fn run(&self) {
        loop {
            if self.shutdown.load(Ordering::Relaxed) {
                break;
            }

            // Check CPU usage before proceeding
            let cpu = self.cpu_monitor.current_usage();
            if cpu > self.config.max_cpu_percent {
                debug!("CPU too high ({:.1}%), skipping filter maintenance", cpu);
                tokio::time::sleep(Duration::from_secs(self.config.check_interval_seconds)).await;
                continue;
            }

            // Find tables needing consolidation
            let tables = self.find_tables_needing_consolidation().await;

            for table in tables {
                // Consolidate in parallel if enabled
                if self.config.parallel_consolidation {
                    self.consolidate_parallel(&table).await;
                } else {
                    self.consolidate_sequential(&table).await;
                }
            }

            tokio::time::sleep(Duration::from_secs(self.config.check_interval_seconds)).await;
        }
    }

    /// Parallel consolidation using rayon
    async fn consolidate_parallel(&self, table: &str) {
        use rayon::prelude::*;

        let deltas = self.load_pending_deltas(table);

        // Parallel bloom filter merge
        let bloom_bits: Vec<_> = deltas.par_iter()
            .flat_map(|d| d.bloom_delta.bits_to_set.par_iter())
            .collect();

        // Apply to base filter atomically
        self.apply_bloom_updates(table, &bloom_bits).await;

        // Parallel zone map updates per block
        let block_groups = self.group_zone_deltas_by_block(&deltas);
        block_groups.par_iter().for_each(|(block_id, updates)| {
            self.merge_zone_updates(table, *block_id, updates);
        });
    }
}
```
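run() above calls find_tables_needing_consolidation(), which is not shown in this document. A hedged sketch of the trigger check it implies, reading the pending_deltas and last_consolidation fields from the FilterIndexMetadata defined earlier; list_tables_with_filters() and load_filter_metadata() are hypothetical storage helpers.

```rust
impl FilterIndexMaintenanceWorker {
    /// Sketch: pick tables whose pending delta count or age exceeds the thresholds.
    async fn find_tables_needing_consolidation(&self) -> Vec<String> {
        let mut due = Vec::new();
        // Hypothetical helper: enumerate tables that have filter index metadata.
        for table in self.storage.list_tables_with_filters() {
            // Hypothetical helper: read fidx:meta:{table} and deserialize it.
            if let Some(meta) = self.storage.load_filter_metadata(&table) {
                let age_seconds = Utc::now()
                    .signed_duration_since(meta.last_consolidation)
                    .num_seconds()
                    .max(0) as u64;
                if meta.pending_deltas >= self.config.delta_threshold
                    || age_seconds >= self.config.time_threshold_seconds
                {
                    due.push(table);
                }
            }
        }
        due
    }
}
```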
4. Parallel-Ready Query Execution
```rust
/// Parallel filter evaluation for FilteredScan
impl PredicatePushdownManager {
    /// Parallel scan with storage-level filtering
    pub fn scan_with_pushdown_parallel(
        &self,
        table_name: &str,
        tuples: Vec<Tuple>,
        predicates: &[AnalyzedPredicate],
        schema: &Schema,
        parallelism: usize,
    ) -> Vec<Tuple> {
        use rayon::prelude::*;

        // Phase 1: Parallel bloom filter check
        let bloom_filtered: Vec<_> = tuples.par_iter()
            .filter(|tuple| self.check_bloom_filters_parallel(table_name, tuple, predicates))
            .cloned()
            .collect();

        // Phase 2: Parallel zone map block pruning (already done at block level)

        // Phase 3: Parallel SIMD predicate evaluation
        // (parallelism is clamped to at least 1 to avoid division by zero)
        let chunk_size = (bloom_filtered.len() / parallelism.max(1)).max(64);
        let results: Vec<_> = bloom_filtered
            .par_chunks(chunk_size)
            .flat_map(|chunk| {
                self.simd_engine.filter_batch_parallel(chunk, predicates, schema)
            })
            .collect();

        results
    }
}
```
Innovation: Speculative Filter Indexes (SFI)
Concept
Unlike traditional indexes that require explicit creation, SFI automatically creates lightweight filter structures based on query patterns:
```
SPECULATIVE FILTER INDEX SYSTEM

Query Pattern Analyzer:
  SELECT * FROM orders WHERE customer_id = ?
      ↓
  Pattern: equality predicate on orders.customer_id
  Frequency: 847 times in last hour
  Selectivity: ~0.01%
      ↓
  Decision: Create bloom filter for orders.customer_id

Auto-Created Filters (no DBA intervention):
  Column       │ Filter Type │ Trigger
  ─────────────┼─────────────┼───────────────────────────────
  customer_id  │ Bloom       │ High equality query frequency
  order_date   │ Zone Map    │ Range queries detected
  status       │ Enum Set    │ Low cardinality column

Lifecycle:
  • Created speculatively after pattern detected
  • Maintained incrementally during DML
  • Dropped automatically if unused for configurable period
  • Zero memory footprint (all storage-persisted)
```
Implementation
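The manager below is written against a QueryPattern type and per-pattern statistics that this document never defines. An assumed minimal shape, just enough for the following code to read coherently:

```rust
/// Assumed shape of the pattern the tracker records (not defined in the original).
#[derive(Clone, Hash, PartialEq, Eq)]
pub struct QueryPattern {
    /// Column the predicate touches
    pub column: String,
    /// Kind of predicate observed
    pub predicate_kind: PredicateKind,
}

#[derive(Clone, Hash, PartialEq, Eq)]
pub enum PredicateKind {
    Equality,
    Range,
    InList,
}

/// Assumed aggregate returned by QueryPatternTracker::get_stats.
pub struct PatternStats {
    pub frequency: u64,
    pub avg_selectivity: f64,
    pub avg_execution_time_ms: u64,
    pub last_seen: DateTime<Utc>,
}
```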
```rust
/// Speculative Filter Index Manager
pub struct SpeculativeFilterManager {
    /// Query pattern tracker
    pattern_tracker: QueryPatternTracker,
    /// Filter index manager
    filter_manager: FilterIndexManager,
    /// Configuration
    config: SpeculativeConfig,
}

#[derive(Clone)]
pub struct SpeculativeConfig {
    /// Minimum query frequency before considering filter creation
    pub min_query_frequency: u64,
    /// Minimum selectivity improvement required (default: 10x)
    pub min_selectivity_improvement: f64,
    /// Days of inactivity before dropping speculative filter
    pub drop_after_days: u32,
    /// Maximum speculative filters per table
    pub max_filters_per_table: usize,
}
```
```rust
impl SpeculativeFilterManager {
    /// Called by optimizer after query execution
    pub fn record_query_pattern(
        &self,
        table: &str,
        predicate: &LogicalExpr,
        selectivity: f64,
        execution_time_ms: u64,
    ) {
        let pattern = self.extract_pattern(predicate);
        // Clone so the pattern can still be inspected after being recorded
        self.pattern_tracker.record(table, pattern.clone(), selectivity, execution_time_ms);

        // Check if we should create a speculative filter
        if self.should_create_filter(table, &pattern) {
            self.create_speculative_filter(table, &pattern);
        }
    }

    fn should_create_filter(&self, table: &str, pattern: &QueryPattern) -> bool {
        let stats = self.pattern_tracker.get_stats(table, pattern);

        // Criteria for filter creation:
        // 1. High query frequency
        // 2. Low selectivity (few rows returned)
        // 3. No existing filter for this column
        // 4. Below max filters per table
        stats.frequency >= self.config.min_query_frequency
            && stats.avg_selectivity < 0.1 // Less than 10% of rows
            && !self.filter_exists(table, &pattern.column)
            && self.filter_count(table) < self.config.max_filters_per_table
    }
}
```
Innovation: Columnar Zone Summaries (CZS)
Concept
Inspired by Exadata Storage Indexes but optimized for row-store with column awareness:
```
COLUMNAR ZONE SUMMARY ARCHITECTURE

Traditional Zone Map:
  Block 0: min=1, max=100 (all columns combined)
  Block 1: min=50, max=200

Columnar Zone Summary (CZS):
  Block 0:
  ├── id:       min=1,     max=100,     distinct≈100, nulls=0
  ├── customer: min=1001,  max=9999,    distinct≈45,  nulls=0
  ├── total:    min=10.50, max=9999.99, avg=156.23,   nulls=2
  └── status:   values={pending,shipped,delivered},   nulls=0

  Additional Summaries:
  ├── Histogram buckets (for range selectivity estimation)
  ├── Most common values (for equality optimization)
  └── Correlation hints (for multi-column predicates)

Storage:     ~2KB per block (persisted in RocksDB)
Memory:      Zero (loaded on-demand during query planning)
Maintenance: Incremental during DML
```
Implementation
```rust
/// Enhanced per-block column summary
#[derive(Serialize, Deserialize)]
pub struct ColumnZoneSummary {
    /// Column name
    pub column_name: String,
    /// Basic statistics
    pub min: Option<Value>,
    pub max: Option<Value>,
    pub null_count: u64,
    pub row_count: u64,
    /// Approximate distinct count (HyperLogLog)
    pub approx_distinct: u64,
    /// Average (for numeric columns)
    pub avg: Option<f64>,
    /// Sum (for numeric columns, useful for aggregation pushdown)
    pub sum: Option<f64>,
    /// Most common values (top 5) with frequencies
    pub mcv: Vec<(Value, u64)>,
    /// Histogram buckets (for selectivity estimation)
    pub histogram: Option<HistogramBuckets>,
}

/// Block-level columnar zone summary
#[derive(Serialize, Deserialize)]
pub struct BlockZoneSummary {
    pub block_id: u64,
    pub first_row_id: u64,
    pub last_row_id: u64,
    pub row_count: u64,
    pub columns: HashMap<String, ColumnZoneSummary>,
    /// Cross-column correlation hints
    pub correlations: Vec<ColumnCorrelation>,
}
```
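update_on_insert below delegates per-value work to ColumnZoneSummary::update_incremental, which is not spelled out in this document. A hedged sketch of an O(1) update covering min/max, null count, and the running sum/average; it assumes Value implements PartialOrd and exposes an as_f64() helper for numeric types (both assumptions).

```rust
impl ColumnZoneSummary {
    /// Sketch: fold a single value into the block summary in O(1).
    pub fn update_incremental(&mut self, value: &Value) {
        self.row_count += 1;
        if matches!(value, Value::Null) {
            self.null_count += 1;
            return;
        }
        // Min/max tracking (assumes Value: PartialOrd).
        match &self.min {
            Some(current_min) if value >= current_min => {}
            _ => self.min = Some(value.clone()),
        }
        match &self.max {
            Some(current_max) if value <= current_max => {}
            _ => self.max = Some(value.clone()),
        }
        // Running sum / average for numeric columns (as_f64 is an assumed helper).
        if let Some(n) = value.as_f64() {
            let sum = self.sum.unwrap_or(0.0) + n;
            self.sum = Some(sum);
            let non_null_rows = (self.row_count - self.null_count) as f64;
            self.avg = Some(sum / non_null_rows);
        }
        // approx_distinct, mcv and histogram would be updated here too
        // (HyperLogLog / top-k sketches); omitted to keep the sketch short.
    }
}
```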
```rust
impl BlockZoneSummary {
    /// Incremental update on INSERT (O(1) per column)
    pub fn update_on_insert(&mut self, tuple: &Tuple, schema: &Schema) {
        self.row_count += 1;

        for (i, col) in schema.columns.iter().enumerate() {
            let summary = self.columns.entry(col.name.clone())
                .or_insert_with(|| ColumnZoneSummary::new(&col.name));

            let value = tuple.get(i).unwrap_or(&Value::Null);
            summary.update_incremental(value);
        }
    }

    /// Can this block satisfy the predicate? (Fast check)
    pub fn can_satisfy(&self, predicates: &[AnalyzedPredicate]) -> BlockDecision {
        for pred in predicates {
            if let Some(summary) = self.columns.get(&pred.column_name) {
                match pred.evaluate_against_summary(summary) {
                    SummaryMatch::Impossible => return BlockDecision::Skip,
                    SummaryMatch::Partial => continue,
                    SummaryMatch::Full => continue,
                }
            }
        }
        BlockDecision::Scan
    }
}
```
Memory vs Storage Comparison
Per-Connection Memory Usage
| System | Memory per Connection | Where Filtering Happens |
|---|---|---|
| Oracle (In-Memory) | ~10MB+ | SIMD in memory |
| PostgreSQL | ~5-10MB | Filter in executor |
| HeliosDB Nano (Current) | ~2-5MB | Filter after load |
| HeliosDB Nano (SMFI) | ~1-2MB | Filter at storage |
Storage Overhead for Filter Indexes
| Table Size | Bloom Filters | Zone Maps | Total Overhead |
|---|---|---|---|
| 1GB | ~10MB (1%) | ~5MB (0.5%) | ~15MB (1.5%) |
| 10GB | ~100MB | ~50MB | ~150MB |
| 100GB | ~1GB | ~500MB | ~1.5GB |
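As a sanity check on the bloom filter column above, the standard sizing formula m = -n · ln(p) / (ln 2)² works out to roughly 1.2 bytes per key at a 1% false-positive rate. A small helper reproducing the estimate; the 100-bytes-per-row assumption used to turn table size into a row count is illustrative only.

```rust
/// Sketch: bloom filter size in bytes for n keys at false-positive rate p.
fn bloom_filter_bytes(n_keys: u64, false_positive_rate: f64) -> u64 {
    // m = -n * ln(p) / (ln 2)^2 bits, rounded up to whole bytes
    let bits = -(n_keys as f64) * false_positive_rate.ln() / (2f64.ln().powi(2));
    (bits / 8.0).ceil() as u64
}

fn main() {
    // Illustrative assumption: a 1GB table with ~100-byte rows holds ~10M rows.
    let rows = 10_000_000u64;
    let bytes = bloom_filter_bytes(rows, 0.01);
    println!("~{} MB for {} rows at 1% FPP", bytes / (1024 * 1024), rows);
    // Prints roughly 11 MB, consistent with the ~10MB (1%) figure in the table above.
}
```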
Query Latency Impact
```
Scenario:   SELECT * FROM orders WHERE customer_id = 12345 AND total > 100
Table size: 10 million rows, 10% match customer_id, 50% match total > 100

Without SMFI:
  Load all 10M rows → Filter in memory → Return 500K rows
  Memory: ~2GB peak
  Time:   ~5 seconds

With SMFI:
  Bloom check customer_id → Zone map prune blocks → SIMD filter remainder
  Memory: ~50MB peak (only matching blocks loaded)
  Time:   ~200ms
```
Implementation Roadmap
Phase 1: Self-Maintaining Filter Updates (Core)
- Integrate filter delta tracking into insert_tuple(), update_tuple(), delete_tuple()
- Persist filter deltas to RocksDB
- Background consolidation worker with CPU awareness
- Crash recovery from delta log (a minimal recovery sketch follows this list)
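A hedged sketch of what the recovery step might look like: scan the fidx:delta:{table}: key range on startup and re-apply any deltas that were logged but not yet consolidated. The rocksdb prefix iteration is standard; FilterDelta, its bincode encoding, and the apply_recovered_delta helper are assumptions.

```rust
use rocksdb::{DB, Direction, IteratorMode};

/// Sketch: replay logged-but-unconsolidated filter deltas after a crash.
fn recover_filter_deltas(
    db: &DB,
    table: &str,
    tracker: &FilterIndexDeltaTracker,
) -> Result<u64, Box<dyn std::error::Error>> {
    let prefix = format!("{}{}:", filter_index_keys::DELTA_PREFIX, table);
    let mut replayed = 0u64;
    for item in db.iterator(IteratorMode::From(prefix.as_bytes(), Direction::Forward)) {
        let (key, value) = item?;
        if !key.starts_with(prefix.as_bytes()) {
            break; // walked past the fidx:delta:{table}: key range
        }
        // FilterDelta and apply_recovered_delta are assumed shapes, not shown above.
        let delta: FilterDelta = bincode::deserialize(&value)?;
        tracker.apply_recovered_delta(table, delta);
        replayed += 1;
    }
    Ok(replayed)
}
```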
Phase 2: Columnar Zone Summaries (Enhanced Statistics)
- Implement BlockZoneSummary with incremental updates
- HyperLogLog for approximate distinct counts (a minimal sketch follows this list)
- Histogram buckets for selectivity estimation
- Integration with optimizer cost model
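For the HyperLogLog item above, a self-contained sketch of the core register update and the raw cardinality estimate. This is the textbook algorithm; the precision (2^14 registers) and the use of std's DefaultHasher are illustrative choices, not decisions from this design.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal HyperLogLog sketch with 2^14 registers (~0.8% relative error).
pub struct HyperLogLog {
    registers: Vec<u8>,
}

impl HyperLogLog {
    const P: u32 = 14; // precision bits, m = 2^14 registers

    pub fn new() -> Self {
        Self { registers: vec![0; 1 << Self::P] }
    }

    pub fn insert<T: Hash>(&mut self, value: &T) {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        let h = hasher.finish();
        let idx = (h >> (64 - Self::P)) as usize;  // top P bits choose the register
        let rest = h << Self::P;                   // remaining bits
        let rank = (rest.leading_zeros() + 1).min(64 - Self::P + 1) as u8;
        if rank > self.registers[idx] {
            self.registers[idx] = rank;            // keep the maximum rank seen
        }
    }

    /// Raw estimate (small/large-range corrections omitted for brevity).
    pub fn estimate(&self) -> f64 {
        let m = self.registers.len() as f64;
        let sum: f64 = self.registers.iter().map(|&r| 2f64.powi(-(r as i32))).sum();
        let alpha = 0.7213 / (1.0 + 1.079 / m);
        alpha * m * m / sum
    }
}
```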
Phase 3: Speculative Filter Indexes (Intelligence)
- Query pattern tracker
- Automatic filter creation based on patterns
- Filter lifecycle management (creation, maintenance, dropping)
- Optimizer hints for speculative filter usage
Phase 4: Parallel Query Support
- Parallel bloom filter evaluation with rayon
- Parallel zone map block pruning
- Parallel SIMD filtering
- Optimizer-driven parallelism decisions
Conclusion
The proposed Self-Maintaining Filter Index (SMFI) architecture provides:
- Zero additional memory for concurrent connections
- Automatic maintenance integrated with existing DML paths
- CPU-aware background consolidation preventing system overload
- Parallel-ready design for optimizer-driven parallelism
- Innovative speculative indexing for workload adaptation
This aligns with HeliosDB Nano’s philosophy of memory optimization while maximizing storage-level filtering efficiency.