HeliosDB Implicit Storage Features

Document Version: 1.0
Date: 2025-01-25
Category: Storage Features / Performance Optimization
Audience: Database Administrators, Developers, Solution Architects


Executive Summary

HeliosDB includes a comprehensive set of implicit (automatic) storage features that operate transparently without explicit user configuration. These features are designed to optimize performance, reduce storage costs, and maintain data integrity “out of the box.”

This document catalogs all implicit storage features, explains their automatic behaviors, and documents which ones can be explicitly configured for advanced optimization.


Table of Contents

  1. Implicit Features Overview
  2. Compression Features
  3. Storage Tiering
  4. MVCC and Versioning
  5. Index and Filter Optimization
  6. Compaction Strategies
  7. Caching and Memory Management
  8. Sharding and Distribution
  9. Write-Ahead Logging
  10. Query Optimization Features
  11. Self-Tuning Features
  12. Configuration Reference
  13. Default Benefits Summary

1. Implicit Features Overview

1.1 What Are Implicit Features?

Implicit features are optimizations that HeliosDB applies automatically without requiring explicit SQL statements or configuration. They include:

| Category | Features | Default Behavior |
| --- | --- | --- |
| Compression | HCC Adaptive, Column-level | Auto-enabled with adaptive algorithm selection |
| Tiering | Hot/Warm/Cold/Archive | Auto-promotes/demotes based on access patterns |
| MVCC | Snapshot Isolation, Versioning | Always enabled for consistency |
| Indexing | Bloom Filters, Sparse Indexes | Auto-created for common query patterns |
| Compaction | LSM-tree, Adaptive Strategy | Auto-triggered based on write patterns |
| Caching | Unified Cache, Prefetching | Auto-managed with adaptive sizing |
| Sharding | Consistent Hashing, Rebalancing | Auto-distributed based on key patterns |
| WAL | Write-Ahead Logging | Always enabled for durability |
| Query Opt | Predicate Pushdown, SIMD | Auto-applied during query planning |
| Self-Tuning | Statistics, Workload Analysis | Continuous background optimization |

1.2 Design Philosophy

HeliosDB follows the “Zero-Configuration Performance” principle:

  • Sensible Defaults: Every feature has production-ready defaults
  • Adaptive Behavior: Features self-tune based on workload characteristics
  • Override Capability: Advanced users can override any automatic behavior
  • Observability: All implicit actions are logged and visible in EXPLAIN output

2. Compression Features

2.1 HCC Adaptive Compression

Location: heliosdb-storage/crates/hybrid-columnar-compression/

HeliosDB implements Oracle-compatible Hybrid Columnar Compression (HCC) with automatic algorithm selection.

Automatic Behaviors:

  • Algorithm Selection: Automatically chooses between Zstd, LZ4, Snappy, or None based on data characteristics
  • Column-Level Compression: Different algorithms per column based on data type and cardinality
  • Compression Ratio Monitoring: Tracks compression effectiveness and adjusts algorithms
  • Dictionary Encoding: Automatically applied to low-cardinality string columns

Compression Algorithm Selection Matrix:

| Data Characteristic | Default Algorithm | Compression Ratio | Speed |
| --- | --- | --- | --- |
| High cardinality numeric | LZ4 | 2-3x | Fast |
| Low cardinality string | Dictionary + Zstd | 10-50x | Medium |
| Timestamp columns | Delta + LZ4 | 5-10x | Fast |
| Binary/BLOB data | Zstd (level 3) | 3-5x | Medium |
| Already compressed | None | 1x | Fastest |
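
The selection matrix can be read as a small decision function. The sketch below is only an illustration of that reading, not the logic in adaptive.rs: the `ColumnProfile` type, the `distinct_ratio` measure, and the 5% cut-off for "low cardinality" are assumptions introduced here, and the fallback for combinations the matrix does not list is a guess.

```python
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    kind: str                 # "numeric", "string", "timestamp", or "binary"
    distinct_ratio: float     # distinct values / total values, in [0, 1]
    already_compressed: bool = False


# Hypothetical cut-off: the document does not say where "low cardinality" ends.
LOW_CARDINALITY_RATIO = 0.05


def choose_algorithm(profile: ColumnProfile) -> str:
    """Map a column profile to the default algorithm named in the matrix."""
    if profile.already_compressed:
        return "none"
    if profile.kind == "timestamp":
        return "delta+lz4"
    if profile.kind == "string" and profile.distinct_ratio <= LOW_CARDINALITY_RATIO:
        return "dictionary+zstd"
    if profile.kind == "binary":
        return "zstd(level=3)"
    # High-cardinality numeric columns use LZ4 per the matrix; using LZ4 for
    # every remaining combination is an assumption of this sketch.
    return "lz4"
```

For example, `choose_algorithm(ColumnProfile("string", 0.01))` returns `"dictionary+zstd"`, matching the "Low cardinality string" row.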

Configuration Options (SQL):

-- View current compression settings
SHOW STORAGE COMPRESSION;
-- Override compression for a table (optional)
ALTER TABLE orders SET (compression = 'zstd', compression_level = 5);
-- Disable compression for specific column
ALTER TABLE orders ALTER COLUMN raw_data SET (compression = 'none');

Source Files:

  • heliosdb-storage/crates/hybrid-columnar-compression/src/compressor.rs
  • heliosdb-storage/crates/hybrid-columnar-compression/src/adaptive.rs

2.2 Default Compression Levels

| Tier | Default Level | Rationale |
| --- | --- | --- |
| Hot | LZ4 (level 1) | Speed prioritized |
| Warm | Zstd (level 3) | Balanced |
| Cold | Zstd (level 9) | Compression prioritized |
| Archive | Zstd (level 19) | Maximum compression |

3. Storage Tiering

3.1 Automatic Data Tiering

Location: heliosdb-storage/crates/intelligent-tiering/

HeliosDB automatically moves data between storage tiers based on access patterns.

Tier Definitions:

| Tier | Storage Type | Access Pattern | Retention |
| --- | --- | --- | --- |
| Hot | NVMe SSD / RAM | Frequent (< 1 hour old) | Active data |
| Warm | SSD | Moderate (1h - 7d old) | Recent data |
| Cold | HDD / Object Storage | Infrequent (7d - 90d) | Historical |
| Archive | Object Storage / Tape | Rare (> 90d) | Compliance |

Automatic Behaviors:

  • Access Pattern Tracking: Monitors read/write frequency per data block
  • Promotion on Read: Frequently accessed cold data promoted to warm/hot
  • Demotion on Age: Data automatically demoted based on last access time
  • Predictive Prefetch: ML-based prediction of data access patterns

Tiering Policies (Default):

hot_tier:
  max_age: 1h
  min_access_frequency: 10/min
  storage: nvme_ssd
warm_tier:
  max_age: 7d
  min_access_frequency: 1/hour
  storage: ssd
cold_tier:
  max_age: 90d
  min_access_frequency: 1/day
  storage: hdd
archive_tier:
  age: ">90d"
  storage: object_storage
  compression: zstd_19
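
One way to read the default policy is as a classifier over a block's age and access rate. The sketch below assumes that a block qualifies for a tier if it is either young enough or accessed often enough; the document does not state how the two thresholds combine, so treat that rule and the function name `tier_for` as assumptions.

```python
from datetime import timedelta

# (tier, max_age, minimum accesses per hour), taken from the default policy.
POLICY = [
    ("hot", timedelta(hours=1), 600.0),     # 10/min
    ("warm", timedelta(days=7), 1.0),       # 1/hour
    ("cold", timedelta(days=90), 1 / 24),   # 1/day
]


def tier_for(age: timedelta, accesses_per_hour: float) -> str:
    """Return the hottest tier whose age OR frequency threshold is met."""
    for name, max_age, min_frequency in POLICY:
        if age <= max_age or accesses_per_hour >= min_frequency:
            return name
    return "archive"  # older than 90 days and rarely accessed
```

Under this reading, a 200-day-old block that suddenly receives 700 reads per hour is classified as hot again, which corresponds to the "Promotion on Read" behavior.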

Configuration Options (SQL):

-- View current tiering status
SELECT * FROM helios_storage_tiers;
-- Override tier retention for a table
ALTER TABLE audit_logs SET (
    hot_retention = '24 hours',
    warm_retention = '30 days',
    cold_retention = '1 year'
);
-- Force immediate tier migration
CALL helios_migrate_to_tier('orders', 'archive', '2024-01-01');

Source Files:

  • heliosdb-storage/crates/intelligent-tiering/src/engine.rs
  • heliosdb-storage/crates/intelligent-tiering/src/policy.rs

4. MVCC and Versioning

4.1 Snapshot Isolation

Location: heliosdb-storage/crates/mvcc/

HeliosDB provides MVCC-based snapshot isolation for all transactions.

Automatic Behaviors:

  • Version Chain Management: Maintains version chains for concurrent access
  • Read Consistency: Readers see consistent snapshot without blocking writers
  • Write Conflict Detection: Automatic detection of write-write conflicts
  • Timestamp Ordering: Assigns monotonic timestamps for version ordering
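
The behaviors above can be illustrated with a toy version chain for a single row. This is a minimal sketch of timestamp-based snapshot reads and first-committer-wins conflict detection in general, not the structure in version_chain.rs; the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    value: str
    commit_ts: int  # monotonic commit timestamp


class VersionChain:
    """All committed versions of one row, oldest first."""

    def __init__(self) -> None:
        self._versions: List[Version] = []

    def latest_commit_ts(self) -> int:
        return self._versions[-1].commit_ts if self._versions else 0

    def read(self, snapshot_ts: int) -> Optional[str]:
        # A reader sees the newest version committed at or before its
        # snapshot, so it never blocks on (or observes) later writers.
        for version in reversed(self._versions):
            if version.commit_ts <= snapshot_ts:
                return version.value
        return None

    def write(self, value: str, start_ts: int, commit_ts: int) -> None:
        # Write-write conflict: another transaction committed this row
        # after the writer took its snapshot.
        if self.latest_commit_ts() > start_ts:
            raise RuntimeError("write-write conflict")
        self._versions.append(Version(value, commit_ts))
```

A transaction that started at timestamp 15 keeps reading the version committed at 10 even after another version commits at 20, and its own attempt to write the row is rejected.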

Isolation Levels:

| Level | Status | Phantom Reads | Non-Repeatable Reads | Dirty Reads |
| --- | --- | --- | --- | --- |
| Serializable | Default | Prevented | Prevented | Prevented |
| Repeatable Read | Available | Possible | Prevented | Prevented |
| Read Committed | Available | Possible | Possible | Prevented |

Configuration Options (SQL):

-- View current isolation level
SHOW TRANSACTION ISOLATION LEVEL;
-- Set session isolation level
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Per-query snapshot
SELECT * FROM orders AS OF TIMESTAMP '2025-01-01 00:00:00';

4.2 Automatic Garbage Collection

Automatic Behaviors:

  • Version Cleanup: Removes old versions no longer visible to any transaction
  • Adaptive GC Policies: Adjusts cleanup frequency based on write rate
  • Tombstone Compaction: Removes deleted rows after retention period
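
Version cleanup rests on one rule: a version can be dropped once a newer version is already visible to the oldest active snapshot. The sketch below shows that rule for one row, assuming commit timestamps are kept in ascending order; `prune_versions` is a hypothetical name, and retention settings such as `mvcc_version_retention` are deliberately left out.

```python
import bisect
from typing import List


def prune_versions(commit_timestamps: List[int], oldest_active_snapshot: int) -> List[int]:
    """Return the commit timestamps that must be kept for one row.

    Keeps the newest version visible to the oldest active snapshot plus
    every newer version; anything older is invisible to all transactions.
    """
    visible_count = bisect.bisect_right(commit_timestamps, oldest_active_snapshot)
    keep_from = max(visible_count - 1, 0)
    return commit_timestamps[keep_from:]
```

With versions committed at 10, 20, 30, and 40 and an oldest active snapshot of 25, only the version at 10 is removed: the snapshot still needs the version at 20.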

GC Policies:

| Policy | Behavior | Use Case |
| --- | --- | --- |
| Aggressive | GC runs every 1 minute | High-write OLTP |
| Balanced (default) | GC runs every 5 minutes | Mixed workloads |
| Conservative | GC runs every 30 minutes | Long-running analytics |

Configuration Options (SQL):

-- View GC statistics
SELECT * FROM helios_gc_stats;
-- Set GC policy
ALTER SYSTEM SET gc_policy = 'balanced';
-- Set minimum version retention
ALTER SYSTEM SET mvcc_version_retention = '1 hour';

Source Files:

  • heliosdb-storage/crates/mvcc/src/gc.rs
  • heliosdb-storage/crates/mvcc/src/version_chain.rs

5. Index and Filter Optimization

5.1 Bloom Filters

Location: heliosdb-storage/crates/bloom-filter/

Bloom filters are automatically maintained for efficient negative lookups.

Automatic Behaviors:

  • Auto-Creation: Bloom filters created for all columns used in WHERE clauses
  • Size Optimization: Filter size automatically tuned for target false positive rate
  • Hierarchical Filters: Multi-level filters for range queries

Default Configuration:

| Parameter | Default Value | Description |
| --- | --- | --- |
| bloom_filter_fp_rate | 0.01 (1%) | Target false positive rate |
| bloom_filter_bits_per_key | 10 | Bits allocated per unique key |
| bloom_filter_enabled | true | Auto-creation enabled |
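
The two numeric defaults are related by the standard Bloom filter sizing formulas: for a target false positive rate p, the optimal size is -ln(p) / (ln 2)^2 bits per key, and the optimal number of hash functions is (bits per key) × ln 2. The sketch below evaluates these textbook formulas; it makes no claim about how builder.rs performs the tuning.

```python
import math


def bits_per_key(fp_rate: float) -> float:
    """Bits per key needed to reach fp_rate with an optimal hash count."""
    return -math.log(fp_rate) / (math.log(2) ** 2)


def optimal_hashes(bits_per_key_: float) -> int:
    """Number of hash functions that minimizes the false positive rate."""
    return max(1, round(bits_per_key_ * math.log(2)))


def expected_fp_rate(bits_per_key_: float, hashes: int) -> float:
    """Approximate false positive rate for the given sizing."""
    return (1 - math.exp(-hashes / bits_per_key_)) ** hashes
```

A 1% target needs about 9.6 bits per key, so the default of 10 bits per key (with 7 hash functions) lands slightly below 1%. Tightening the rate to 0.001, as in the SQL example, needs roughly 14.4 bits per key.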

Configuration Options (SQL):

-- View bloom filter statistics
SELECT * FROM helios_bloom_filter_stats WHERE table_name = 'orders';
-- Disable bloom filter for specific column
ALTER TABLE orders ALTER COLUMN description SET (bloom_filter = false);
-- Set custom false positive rate
ALTER TABLE orders SET (bloom_filter_fp_rate = 0.001);

5.2 Sparse Indexes

Automatic Behaviors:

  • Auto-Creation: Sparse indexes created for sorted columns
  • Min-Max Tracking: Block-level min/max values for range pruning
  • Zone Maps: Automatic zone map maintenance for partition pruning
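
Min-max tracking lets a range scan skip whole blocks without reading them. The following sketch shows the idea on an in-memory list of blocks; the `Block` layout and function names are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    min_value: int
    max_value: int
    rows: List[int]


def build_blocks(sorted_values: List[int], block_size: int) -> List[Block]:
    """Split sorted values into blocks and record each block's min/max."""
    blocks = []
    for start in range(0, len(sorted_values), block_size):
        chunk = sorted_values[start:start + block_size]
        blocks.append(Block(min(chunk), max(chunk), chunk))
    return blocks


def range_scan(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return values in [low, high] and the number of blocks actually read."""
    hits: List[int] = []
    blocks_read = 0
    for block in blocks:
        if block.max_value < low or block.min_value > high:
            continue  # pruned using the min/max metadata alone
        blocks_read += 1
        hits.extend(v for v in block.rows if low <= v <= high)
    return hits, blocks_read
```

With 100 sorted values in blocks of 10, a scan for the range 25 to 34 reads 2 of the 10 blocks.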

Source Files:

  • heliosdb-storage/crates/bloom-filter/src/builder.rs
  • heliosdb-storage/src/index/sparse.rs

6. Compaction Strategies

6.1 LSM-Tree Compaction

Location: heliosdb-storage/crates/lsm-forest/

HeliosDB uses LSM-tree storage with automatic compaction.

Compaction Strategies:

| Strategy | Description | Best For |
| --- | --- | --- |
| Leveled (default) | Size-tiered levels with sorted runs | Read-heavy workloads |
| Tiered | Time-ordered tiers | Write-heavy workloads |
| FIFO | First-in-first-out | Time-series data |
| Universal | Hybrid approach | Mixed workloads |

Automatic Behaviors:

  • Trigger Threshold: Compaction triggered when level exceeds size ratio
  • Write Amplification Control: Limits write amplification factor
  • Concurrent Compaction: Background threads for non-blocking compaction
  • Compaction Priority: Prioritizes levels with most impact

Default Configuration:

| Parameter | Default Value | Description |
| --- | --- | --- |
| compaction_style | leveled | Compaction strategy |
| level_size_ratio | 10 | Size multiplier between levels |
| max_levels | 7 | Maximum LSM levels |
| max_write_amplification | 20 | Maximum write amplification |
| compaction_threads | 4 | Background compaction threads |
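
To make `level_size_ratio` and `max_levels` concrete, the sketch below computes a target size per level and reports which levels exceed their target, which is the condition described under "Trigger Threshold". The base size is a parameter of this illustration, and the helper names are hypothetical.

```python
from typing import List


def level_targets(base_bytes: int, ratio: int = 10, max_levels: int = 7) -> List[int]:
    """Target size of each level: every level is `ratio` times the previous."""
    return [base_bytes * ratio ** level for level in range(max_levels)]


def levels_needing_compaction(level_sizes: List[int], targets: List[int]) -> List[int]:
    """Indexes of levels whose current size exceeds the target size."""
    return [
        level
        for level, (size, target) in enumerate(zip(level_sizes, targets))
        if size > target
    ]
```

With the defaults, the seventh level's target is 10^6 times the base size, which is why a small number of levels covers a very large data set.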

Configuration Options (SQL):

-- View compaction statistics
SELECT * FROM helios_compaction_stats;
-- Change compaction strategy for table
ALTER TABLE time_series SET (compaction_style = 'fifo');
-- Trigger manual compaction
CALL helios_compact_table('orders');
-- Set compaction priority
ALTER TABLE orders SET (compaction_priority = 'high');

Source Files:

  • heliosdb-storage/crates/lsm-forest/src/compaction/
  • heliosdb-storage/crates/lsm-forest/src/levels.rs

7. Caching and Memory Management

7.1 Unified Cache

Location: heliosdb-cache/crates/unified-cache/

HeliosDB maintains a unified cache layer with intelligent eviction.

Cache Tiers:

| Tier | Location | Size | Eviction Policy |
| --- | --- | --- | --- |
| L1 | CPU Cache | Auto | LRU |
| L2 | RAM (Hot) | 25% of RAM | ARC (Adaptive) |
| L3 | RAM (Warm) | 50% of RAM | Clock-Pro |
| L4 | SSD Cache | Configurable | FIFO |

Automatic Behaviors:

  • Adaptive Sizing: Cache sizes adjust based on workload
  • Scan Resistance: Prevents large scans from evicting hot data
  • Prefetching: Predictive read-ahead for sequential access
  • Memory Pressure Detection: Automatic cache shrinking under pressure

Default Configuration:

| Parameter | Default Value | Description |
| --- | --- | --- |
| cache_size_ratio | 0.75 | Fraction of available RAM |
| prefetch_enabled | true | Predictive read-ahead |
| scan_resistance | true | Protect hot data from scans |
| adaptive_sizing | true | Dynamic size adjustment |

Configuration Options (SQL):

-- View cache statistics
SELECT * FROM helios_cache_stats;
-- Set cache size limit
ALTER SYSTEM SET cache_size = '8GB';
-- Disable prefetching for specific table
ALTER TABLE large_scans SET (prefetch_enabled = false);
-- Clear cache (careful in production)
CALL helios_clear_cache();

7.2 Query Result Cache

Automatic Behaviors:

  • Result Caching: Caches results of identical queries
  • Invalidation: Automatic invalidation on table modifications
  • TTL-Based Expiry: Configurable time-to-live for cached results
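
The three behaviors above fit in a few lines. This sketch keys the cache on the exact query text and tracks which tables each result depends on; the class name, the method names, and the injectable clock are assumptions made so the example can be run and tested, not HeliosDB interfaces.

```python
import time
from typing import Callable, Dict, FrozenSet, Iterable, Optional, Tuple


class ResultCache:
    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # query text -> (result, tables the result depends on, expiry time)
        self._entries: Dict[str, Tuple[object, FrozenSet[str], float]] = {}

    def put(self, sql: str, tables: Iterable[str], result: object) -> None:
        self._entries[sql] = (result, frozenset(tables), self._clock() + self._ttl)

    def get(self, sql: str) -> Optional[object]:
        entry = self._entries.get(sql)
        if entry is None:
            return None
        result, _tables, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[sql]  # TTL-based expiry
            return None
        return result

    def invalidate_table(self, table: str) -> None:
        # Called after any modification of `table`.
        stale = [sql for sql, entry in self._entries.items() if table in entry[1]]
        for sql in stale:
            del self._entries[sql]
```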

Source Files:

  • heliosdb-cache/crates/unified-cache/src/adaptive.rs
  • heliosdb-cache/crates/unified-cache/src/prefetch.rs

8. Sharding and Distribution

8.1 Automatic Sharding

Location: heliosdb-cluster/crates/sharding/

HeliosDB automatically distributes data across nodes using consistent hashing.

Sharding Strategies:

| Strategy | Description | Best For |
| --- | --- | --- |
| Hash (default) | Consistent hash on primary key | Uniform distribution |
| Range | Range-based partitioning | Time-series, ordered queries |
| Geo | Geographic distribution | Multi-region deployments |
| Composite | Hash + Range hybrid | Complex access patterns |

Automatic Behaviors:

  • Key Distribution: Automatically hashes keys to shards
  • Rebalancing: Automatic shard rebalancing on node changes
  • Hot Spot Detection: Identifies and splits hot shards
  • Partition Pruning: Query optimizer prunes irrelevant shards

Default Configuration:

| Parameter | Default Value | Description |
| --- | --- | --- |
| sharding_strategy | hash | Default sharding method |
| replication_factor | 3 | Copies of each shard |
| shard_count | auto | Automatically determined |
| auto_rebalance | true | Automatic rebalancing |

Configuration Options (SQL):

-- View shard distribution
SELECT * FROM helios_shard_info WHERE table_name = 'orders';
-- Set sharding strategy
CREATE TABLE events (
    event_id UUID,
    timestamp TIMESTAMPTZ,
    data JSONB
) PARTITION BY RANGE (timestamp)
  SHARD BY HASH (event_id);
-- Manual shard split
CALL helios_split_shard('orders', 'shard_001');
-- View rebalancing status
SELECT * FROM helios_rebalance_status;

8.2 Consistent Hashing

Implementation Details:

  • Virtual Nodes: 1024 virtual nodes per physical node
  • Hash Function: xxHash64 for speed
  • Rebalance Threshold: Triggers when imbalance > 10%
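
A hash ring with virtual nodes can be sketched as a sorted list of (hash, node) pairs. The example below uses 1024 virtual nodes per physical node as stated above, but substitutes an 8-byte BLAKE2b digest for xxHash64 because xxHash is not in the Python standard library; the class and its methods are hypothetical and do not mirror consistent_hash.rs.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _hash64(key: str) -> int:
    # Stand-in for xxHash64: a 64-bit value from an 8-byte BLAKE2b digest.
    return int.from_bytes(hashlib.blake2b(key.encode(), digest_size=8).digest(), "big")


class HashRing:
    def __init__(self, nodes: Iterable[str], vnodes: int = 1024) -> None:
        self._vnodes = vnodes
        self._ring: List[Tuple[int, str]] = []  # sorted by hash
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash64(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def lookup(self, key: str) -> str:
        """Return the node owning `key`: the first virtual node clockwise."""
        if not self._ring:
            raise LookupError("ring is empty")
        index = bisect.bisect_right(self._ring, (_hash64(key), ""))
        return self._ring[index % len(self._ring)][1]
```

The property that makes rebalancing cheap is visible here: when a node is added, the only keys that change owner are the ones that move to the new node.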

Source Files:

  • heliosdb-cluster/crates/sharding/src/consistent_hash.rs
  • heliosdb-cluster/crates/sharding/src/elastic.rs

9. Write-Ahead Logging

9.1 WAL Configuration

Location: heliosdb-storage/crates/wal/

HeliosDB uses write-ahead logging for durability with automatic management.

Automatic Behaviors:

  • Synchronous Writes: WAL entries synced before commit acknowledgment
  • Log Rotation: Automatic rotation based on size/time
  • Checkpoint Management: Periodic checkpointing for recovery speed
  • Log Compression: Optional compression of archived WAL segments
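
The "Synchronous Writes" behavior means a commit is acknowledged only after its log record has reached stable storage. The sketch below shows that ordering with `os.fsync`, plus a replay loop that stops at a torn or corrupt tail. The record layout (4-byte length, 4-byte CRC32, payload) is an assumption of this example and not the HeliosDB WAL format.

```python
import os
import struct
import zlib
from typing import List


class WalWriter:
    def __init__(self, path: str) -> None:
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)

    def commit(self, payload: bytes) -> None:
        """Append one record and return only after it is synced to disk."""
        header = struct.pack("<II", len(payload), zlib.crc32(payload))
        os.write(self._fd, header + payload)
        os.fsync(self._fd)  # durability point: acknowledge after this returns

    def close(self) -> None:
        os.close(self._fd)


def replay(path: str) -> List[bytes]:
    """Read records back in order, stopping at the first incomplete one."""
    records: List[bytes] = []
    with open(path, "rb") as log:
        while True:
            header = log.read(8)
            if len(header) < 8:
                break
            length, checksum = struct.unpack("<II", header)
            payload = log.read(length)
            if len(payload) < length or zlib.crc32(payload) != checksum:
                break  # torn write or corruption at the tail
            records.append(payload)
    return records
```

The `async` and `off` modes in the durability table below trade away exactly this per-commit sync.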

Default Configuration:

| Parameter | Default Value | Description |
| --- | --- | --- |
| wal_sync_mode | fsync | Durability guarantee |
| wal_segment_size | 64MB | Size per WAL segment |
| checkpoint_interval | 5 minutes | Time between checkpoints |
| wal_retention | 24 hours | Minimum WAL retention |

Durability Modes:

| Mode | Behavior | Durability | Performance |
| --- | --- | --- | --- |
| fsync (default) | Sync each commit | Highest | Baseline |
| fdatasync | Sync data only | High | 10-20% faster |
| async | Periodic sync | Medium | 2-3x faster |
| off | No sync | None | Fastest (dev only) |

Configuration Options (SQL):

-- View WAL statistics
SELECT * FROM helios_wal_stats;
-- Change sync mode (careful!)
ALTER SYSTEM SET wal_sync_mode = 'fdatasync';
-- Set checkpoint interval
ALTER SYSTEM SET checkpoint_interval = '10 minutes';
-- Force checkpoint
CHECKPOINT;

Source Files:

  • heliosdb-storage/crates/wal/src/writer.rs
  • heliosdb-storage/crates/wal/src/checkpoint.rs

10. Query Optimization Features

10.1 Predicate Pushdown

Location: heliosdb-compute/src/optimizer/

Predicates are automatically pushed down to the storage layer.

Automatic Behaviors:

  • Filter Pushdown: WHERE clauses pushed to storage scan
  • Join Predicate Pushdown: Join conditions evaluated during scan
  • Expression Evaluation: Compatible expressions evaluated at storage level

Example:

-- Original query
SELECT * FROM orders WHERE status = 'pending' AND amount > 100;
-- Internally optimized to:
-- 1. Bloom filter check for status = 'pending'
-- 2. Storage-level filter for amount > 100
-- 3. Only matching rows returned to query engine

10.2 Projection Pushdown

Automatic Behaviors:

  • Column Pruning: Only requested columns read from storage
  • Computed Column Deferral: Complex expressions deferred when beneficial
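
Predicate and projection pushdown both amount to doing the filtering and column selection inside the scan, so that less data crosses into the query engine. A minimal sketch, with rows modeled as dictionaries and a hypothetical `storage_scan` function:

```python
from typing import Callable, Dict, Iterable, Iterator, Optional, Sequence

Row = Dict[str, object]


def storage_scan(
    rows: Iterable[Row],
    predicate: Optional[Callable[[Row], bool]] = None,
    columns: Optional[Sequence[str]] = None,
) -> Iterator[Row]:
    """Yield only matching rows, restricted to the requested columns."""
    for row in rows:
        if predicate is not None and not predicate(row):
            continue  # filtered at the storage level
        if columns is None:
            yield dict(row)
        else:
            yield {name: row[name] for name in columns}
```

For the example query in 10.1, the predicate would be `lambda r: r["status"] == "pending" and r["amount"] > 100`; passing `columns=["id"]` additionally models column pruning.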

10.3 SIMD Vectorization

Location: heliosdb-compute/crates/simd-accel/

HeliosDB automatically uses SIMD instructions for supported operations.

Automatic Behaviors:

  • CPU Feature Detection: Detects AVX2, AVX-512 availability
  • Batch Processing: Processes data in vectorized batches
  • Fallback Mode: Graceful fallback to scalar operations

Supported Operations:

  • Numeric comparisons and arithmetic
  • String operations (LIKE, equality)
  • Aggregations (SUM, COUNT, AVG, MIN, MAX)
  • Hash computations

10.4 Parallel Execution

Automatic Behaviors:

  • Query Parallelization: Large queries split across workers
  • Parallel Scans: Multiple threads for table scans
  • Parallel Aggregation: Distributed aggregation with merge

Default Configuration:

| Parameter | Default Value | Description |
| --- | --- | --- |
| parallel_workers | CPU cores - 2 | Max parallel workers |
| parallel_threshold | 10000 rows | Min rows for parallel |
| parallel_scan_enabled | true | Enable parallel scans |
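
The defaults above imply a simple planning rule: stay serial below the row threshold, otherwise use the core count minus two. The sketch below encodes that rule and pairs it with a split-and-merge aggregation, which is the general shape of "Parallel Aggregation". Both functions are illustrative; treating the threshold as inclusive and using a thread pool are choices made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def plan_workers(estimated_rows: int, cpu_cores: int, threshold: int = 10_000) -> int:
    """Number of workers to use for a query over `estimated_rows` rows."""
    if estimated_rows < threshold:
        return 1
    return max(1, cpu_cores - 2)


def parallel_sum(values: List[int], workers: int) -> int:
    """Sum by computing partial sums per chunk and merging them."""
    if workers <= 1 or not values:
        return sum(values)
    chunk = -(-len(values) // workers)  # ceiling division
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, parts))
```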

Configuration Options (SQL):

-- View parallel execution stats
EXPLAIN ANALYZE SELECT COUNT(*) FROM orders;
-- Set max parallel workers
SET max_parallel_workers = 8;
-- Disable parallel for session
SET parallel_query_enabled = false;

Source Files:

  • heliosdb-compute/src/optimizer/predicate_pushdown.rs
  • heliosdb-compute/crates/simd-accel/src/operations.rs

11. Self-Tuning Features

11.1 Automatic Statistics

Location: heliosdb-compute/src/statistics/

HeliosDB automatically maintains query statistics.

Automatic Behaviors:

  • Sample-Based Statistics: Automatic sampling for large tables
  • Histogram Generation: Automatic histogram creation for skewed columns
  • Statistics Refresh: Background refresh based on data changes

Default Configuration:

| Parameter | Default Value | Description |
| --- | --- | --- |
| auto_analyze_threshold | 10% changes | Trigger threshold |
| sample_ratio | 0.1 (10%) | Default sample size |
| histogram_buckets | 100 | Default bucket count |
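
These three defaults can be tied together in a short sketch: decide whether a refresh is due, draw a sample, and build a histogram from it. The document does not say which histogram type HeliosDB builds, so the equi-depth construction here is an assumption, as are the function names and the fixed random seed.

```python
import random
from typing import List, Sequence


def needs_analyze(rows_changed: int, total_rows: int, threshold: float = 0.10) -> bool:
    """True once the changed fraction reaches the auto-analyze threshold."""
    return total_rows > 0 and rows_changed / total_rows >= threshold


def sample_rows(values: Sequence[int], ratio: float = 0.1, seed: int = 0) -> List[int]:
    """Draw a uniform random sample of roughly `ratio` of the values."""
    size = min(len(values), max(1, round(len(values) * ratio))) if values else 0
    return random.Random(seed).sample(list(values), size)


def equi_depth_bounds(sample: Sequence[int], buckets: int = 100) -> List[int]:
    """Upper bound of each bucket so that buckets hold equal row counts."""
    ordered = sorted(sample)
    count = len(ordered)
    buckets = min(buckets, count)
    return [ordered[(i * count) // buckets - 1] for i in range(1, buckets + 1)]
```

Equi-depth buckets are a common choice for skewed columns because dense value ranges get narrower buckets.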

11.2 Workload Analysis

Location: heliosdb-ai/crates/workload-predictor/

HeliosDB analyzes workload patterns for optimization.

Automatic Behaviors:

  • Query Pattern Detection: Identifies common query patterns
  • Resource Prediction: Predicts resource needs for queries
  • Adaptive Configuration: Adjusts settings based on workload

11.3 Hybrid Bayesian-Genetic Optimizer (HBGDO)

Location: heliosdb-ai/crates/automl-tuning/

Advanced automatic parameter tuning using machine learning.

Automatic Behaviors:

  • Parameter Space Exploration: GA for global search, BO for refinement
  • Multi-Objective Optimization: Balances latency, throughput, resources
  • Safe Rollback: Automatic rollback if performance degrades

Source Files:

  • heliosdb-ai/crates/workload-predictor/src/analyzer.rs
  • heliosdb-ai/crates/automl-tuning/src/hybrid_optimizer.rs

12. Configuration Reference

12.1 All Configurable Parameters

-- View all implicit feature settings
SHOW ALL HELIOS_SETTINGS;
-- Common configuration commands
ALTER SYSTEM SET <parameter> = <value>; -- Cluster-wide
SET <parameter> = <value>; -- Session-level
ALTER TABLE <table> SET (<param> = <value>); -- Table-level

12.2 Configuration Hierarchy

Settings are resolved in the following order, with each level overriding the ones before it:

  1. System Default - Built-in defaults
  2. Cluster Configuration - ALTER SYSTEM settings
  3. Database Configuration - Per-database overrides
  4. Table Configuration - Per-table overrides
  5. Session Configuration - Per-session overrides
  6. Query Hints - Per-query overrides
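
Resolution through the hierarchy can be sketched as a lookup from the most specific level to the least specific. This assumes that levels later in the list take precedence over earlier ones; the level keys and the `resolve` function are names invented for this example.

```python
from typing import Dict, Tuple

# Least specific first, matching the numbered list above.
LEVELS = ["system_default", "cluster", "database", "table", "session", "query_hint"]


def resolve(parameter: str, settings: Dict[str, Dict[str, object]]) -> Tuple[object, str]:
    """Return (value, level) from the most specific level that sets it."""
    for level in reversed(LEVELS):
        scope = settings.get(level, {})
        if parameter in scope:
            return scope[parameter], level
    raise KeyError(parameter)
```

For instance, if the system default sets `compression` to `'auto'` and a table sets it to `'zstd'`, `resolve` returns the table-level value.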

12.3 Key Configuration Groups

| Group | Parameters | Level |
| --- | --- | --- |
| Compression | compression, compression_level | Table |
| Tiering | hot_retention, warm_retention, cold_retention | Table |
| MVCC | isolation_level, version_retention | Session |
| Cache | cache_size, prefetch_enabled | System |
| Compaction | compaction_style, compaction_threads | System/Table |
| Parallel | max_parallel_workers, parallel_threshold | Session |
| WAL | wal_sync_mode, checkpoint_interval | System |

13. Default Benefits Summary

13.1 Out-of-Box Performance Benefits

By default, without any configuration, HeliosDB provides:

| Benefit | Feature | Typical Improvement |
| --- | --- | --- |
| Storage Reduction | HCC Adaptive Compression | 3-10x compression |
| Query Speed | Predicate/Projection Pushdown | 2-5x faster |
| Memory Efficiency | Unified Cache + ARC | 30-50% hit rate improvement |
| Write Performance | LSM + Compaction | Consistent write latency |
| Read Consistency | MVCC Snapshot Isolation | Zero read locks |
| Durability | WAL + Checkpointing | Zero data loss |
| Scalability | Auto Sharding | Linear scale-out |
| Cost Optimization | Automatic Tiering | 50-70% storage cost reduction |

13.2 When to Tune

Consider manual tuning when:

  • Specific workload patterns are well understood
  • Extreme performance requirements (sub-millisecond)
  • Compliance requirements (specific durability modes)
  • Cost optimization for known data lifecycle
  • Multi-tenant isolation requirements

13.3 Monitoring Implicit Features

-- View all implicit feature activity
SELECT * FROM helios_implicit_features_status;
-- View optimization decisions in query plan
EXPLAIN (FORMAT JSON, FEATURES ON) SELECT * FROM orders WHERE status = 'pending';
-- View storage feature statistics
SELECT * FROM helios_storage_stats;
-- View cache and memory statistics
SELECT * FROM helios_memory_stats;

Appendix A: EXPLAIN Output with Implicit Features

The enhanced EXPLAIN command shows all implicit features applied:

EXPLAIN (FORMAT TEXT, FEATURES ON, WHY_NOT ON)
SELECT * FROM orders WHERE status = 'pending' AND amount > 100;

Sample Output:

Query Plan:
  Scan: orders
    Filter: status = 'pending' AND amount > 100
  Implicit Features Active:
    [X] Predicate Pushdown: status = 'pending' pushed to storage
    [X] Bloom Filter: Checked for status column (1% FP rate)
    [X] Compression: HCC Adaptive (Zstd level 3)
    [X] Cache: L2 cache hit for orders.status index
    [X] SIMD: AVX2 used for amount > 100 comparison
    [X] Parallel: 4 workers assigned
  Why-Not Analysis:
    [ ] Index Scan: No index on (status, amount) - consider CREATE INDEX
    [ ] Partition Pruning: Table not partitioned
  Optimization Suggestions:
    1. CREATE INDEX idx_orders_status_amount ON orders(status, amount)
       - Estimated improvement: 10x for this query pattern

Appendix B: Feature Matrix by Edition

| Feature | Community | Enterprise | Cloud |
| --- | --- | --- | --- |
| HCC Compression | Basic | Full Adaptive | Full + S3 Integration |
| Storage Tiering | 2 tiers | 4 tiers | Unlimited + S3/Glacier |
| MVCC | Full | Full | Full |
| Bloom Filters | Full | Full | Full |
| Compaction | Leveled only | All strategies | All + Cloud-optimized |
| Caching | L2-L3 | L1-L4 | L1-L4 + CDN |
| Sharding | Manual | Auto | Auto + Cross-region |
| WAL | Full | Full | Full + Cross-AZ |
| Query Optimization | Full | Full + ML hints | Full + ML + Cost optimization |
| Self-Tuning | Manual stats | Auto stats | HBGDO + Workload prediction |


Appendix C: Source File Reference

All implicit features are implemented in the following locations:

Compression Features

| Feature | Source File | Lines |
| --- | --- | --- |
| HCC Adaptive | heliosdb-storage/src/hcc/adaptive_compression.rs | 1-150 |
| Time-Series Gorilla | heliosdb-storage/src/timeseries/compression_v2.rs | 1-100 |
| Dictionary Encoding | heliosdb-storage/src/hcc/enhanced_dictionary.rs | - |
| Delta Encoding | heliosdb-storage/src/hcc/delta_encoding.rs | - |
| RLE | heliosdb-storage/src/hcc/run_length_encoding.rs | - |

Tiering Features

| Feature | Source File | Lines |
| --- | --- | --- |
| Multi-Tier Policy | heliosdb-storage/src/cloud/tiering_policy.rs | 14-95 |
| Time-Series Tiering | heliosdb-storage/src/timeseries/tiered_storage.rs | 24-100 |

MVCC Features

| Feature | Source File | Lines |
| --- | --- | --- |
| Snapshot Manager | heliosdb-storage/src/mvcc_snapshot_manager.rs | 16-96 |
| Advanced GC | heliosdb-storage/src/advanced_mvcc_gc.rs | 18-35 |
| GC Tuning | heliosdb-storage/src/gc_tuning.rs | 8-56 |

Index Features

| Feature | Source File | Lines |
| --- | --- | --- |
| BRIN Index | heliosdb-query/crates/indexes/src/brin.rs | 54-82 |
| Index Maintenance | heliosdb-storage/src/index/maintenance.rs | 79-100 |
| Adaptive Selection | heliosdb-storage/src/adaptive_index_selection.rs | - |

Compaction Features

| Feature | Source File | Lines |
| --- | --- | --- |
| Compaction Strategy | heliosdb-storage/src/compaction.rs | 16-55 |
| LSM Tuning | heliosdb-storage/src/lsm_tuning.rs | 8-145 |

Caching Features

| Feature | Source File | Lines |
| --- | --- | --- |
| Unified Cache | heliosdb-cache/src/unified/mod.rs | 1-83 |
| Prefetching | heliosdb-cache/src/prefetch/mod.rs | 1-77 |
| ML Predictor | heliosdb-cache/src/ml_predictor/ | - |

Query Optimization

| Feature | Source File | Lines |
| --- | --- | --- |
| Distributed Optimizer | heliosdb-query/crates/distributed-optimizer/src/optimizer.rs | 11-33 |
| Cost Optimizer v2 | heliosdb-query/crates/cost-optimizer-v2/src/auto_optimizer.rs | 1-78 |
| Statistics | heliosdb-query/crates/cost-optimizer-v2/src/statistics.rs | 25-95 |

Sharding Features

| Feature | Source File | Lines |
| --- | --- | --- |
| Consistent Hash Ring | heliosdb-cluster/src/sharding/hash_ring.rs | 28-84 |

WAL Features

| Feature | Source File | Lines |
| --- | --- | --- |
| WAL Writer | heliosdb-storage/src/wal.rs | 1-95 |

Appendix D: Actual Default Configuration Values

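Based on codebase analysis, these are the exact defaults. Some differ from the values given in earlier sections (for example, the compaction strategy is SizeTiered here but leveled in Section 6.1), so confirm the effective setting on a running system with SHOW ALL HELIOS_SETTINGS.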

Storage (heliosdb-storage/src/config.rs)

memtable_size_mb: 128, // Optimized for 3-node cluster
flush_threshold: 0.9, // 90% before forced flush
write_batch_size_bytes: 4MB,
bloom_filter_fp_rate: 0.01, // 1% false positive
enable_compression: true,
compaction_threads: 4,

Compaction (heliosdb-storage/src/compaction.rs)

strategy: CompactionStrategy::SizeTiered,
min_sstables_for_compaction: 4,
level0_size_threshold: 100MB,
level_size_multiplier: 10,
gc_grace_seconds: 864000, // 10 days
max_concurrent_compactions: 4,

MVCC GC (heliosdb-storage/src/gc_tuning.rs)

max_pause_ms: 50,
trigger_threshold_percent: 75.0,
enable_incremental: true,
incremental_slice_us: 5000, // 5ms slices
enable_arena_allocation: true,
arena_count: num_cpus::get(),

Query Optimizer (distributed-optimizer/src/optimizer.rs)

enable_join_reordering: true,
enable_partition_pruning: true,
enable_predicate_pushdown: true,
enable_cost_based_optimization: true,
max_join_reorder_size: 8,
timeout_ms: 100,

BRIN Index (heliosdb-query/crates/indexes/src/brin.rs)

pages_per_range: 128,
enable_bloom_filters: true,
bloom_filter_items: 1000,
bloom_filter_fp_rate: 0.01,
enable_minmax: true,
track_nulls: true,

Tiering (heliosdb-storage/src/timeseries/tiered_storage.rs)

hot_tier: { aggregation: None, compression: Level1 }
warm_tier: { aggregation: 5min, compression: Level6 }
cold_tier: { aggregation: 1hr, compression: Level9 }
archive: { aggregation: N/A, compression: Level9 }

Document Metadata:

  • Classification: Technical Reference
  • Review Cycle: Quarterly
  • Last Updated: 2025-01-25
  • Related Documents:
    • HBGDO_VS_ORACLE_CBO_COMPARISON.md
    • PROTOCOL_FEATURE_MATRIX.md
    • USER_DOCUMENTATION_INDEX.md