
HeliosDB Tier 2/3 AI/ML Features User Guide


Version: v7.1.2 | Last Updated: January 25, 2026 | Status: Production Beta


Overview

HeliosDB includes 9 advanced AI/ML packages (Tier 2/3 features) that provide cutting-edge database intelligence capabilities. These features collectively represent $83M-$151M ARR potential and include 4 world-first innovations.

Package Summary

| Package | Category | Innovation | Completeness | ARR Potential |
|---|---|---|---|---|
| Neural Query Planner | Query Optimization | World-First | 80% | $9M-$20M |
| Schema AI | Data Modeling | World-First | 75% | $15M-$25M |
| RL-Based Cache | Caching | World-First | 70% | $10M-$18M |
| MAB Load Balancer | Load Balancing | World-First | 75% | $8M-$15M |
| Anomaly Detection | Monitoring | Advanced | 75% | $8M-$18M |
| Time-Series Forecasting | Analytics | Advanced | 70% | $7M-$15M |
| AutoML Tuning | Configuration | Advanced | 65% | $12M-$18M |
| Auto-Index | Indexing | Advanced | 70% | $8M-$12M |
| Probabilistic Structures | Data Structures | Advanced | 70% | $6M-$10M |

1. Neural Query Planner

Location: heliosdb-ai/crates/neural-planner

Description

World’s first production deep learning-based query optimizer using Transformer encoder + Graph Neural Network architecture for plan generation.

Key Features

  • Deep Learning Query Optimization: Transformer + GNN architecture
  • Real-time Inference: Sub-5ms latency plan generation
  • Learned Cost Model: Neural network-based cost estimation
  • Beam Search Exploration: Guided plan search with neural heuristics
  • ONNX Export: Production deployment via tract-onnx

Usage

use heliosdb_neural_planner::{
    planner::{NeuralPlanner, PlannerConfig},
    featurizer::QueryFeaturizer,
};

// Initialize planner with custom config
let config = PlannerConfig::default()
    .with_inference_batch_size(64)
    .with_beam_width(5)
    .with_model_type("transformer-gnn");
let planner = NeuralPlanner::new(config);

// Train on your workload
let training_queries = vec![
    ("SELECT * FROM users WHERE id = ?", vec!["IndexScan(users.id_idx)"]),
];
for (query, plan) in &training_queries {
    planner.add_training_example(query, plan).await?;
}
planner.train(100, 0.001).await?;

// Generate optimized plans
let features = QueryFeaturizer::new().featurize(&parsed_query).await?;
let plan = planner.generate_plan(&features).await?;

// Export for production
planner.export_onnx("models/query_planner.onnx").await?;

Configuration Options

| Option | Default | Description |
|---|---|---|
| inference_batch_size | 64 | Batch size for inference |
| beam_width | 5 | Beam search width |
| model_type | "transformer-gnn" | Model architecture |
| learning_rate | 0.001 | Training learning rate |
| max_plan_depth | 20 | Maximum plan tree depth |
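Beam search itself is easy to picture with a toy, self-contained sketch: at every expansion step, each surviving partial plan is extended and only the beam_width cheapest candidates are kept. The cost closure below is a stand-in for the learned neural cost model; none of these names are NeuralPlanner APIs.

```rust
// Toy beam search over join orders: at each step, extend every surviving
// partial plan by one table and keep only the beam_width cheapest.
// The cost closure is a stand-in for the learned neural cost model.
fn beam_search(
    tables: &[&str],
    beam_width: usize,
    cost: impl Fn(&[&str]) -> f64,
) -> Vec<String> {
    let mut beam: Vec<Vec<&str>> = vec![vec![]];
    for _ in 0..tables.len() {
        let mut candidates: Vec<Vec<&str>> = Vec::new();
        for partial in &beam {
            for &t in tables {
                if !partial.contains(&t) {
                    let mut next = partial.clone();
                    next.push(t);
                    candidates.push(next);
                }
            }
        }
        // Prune: keep only the beam_width cheapest partial plans.
        candidates.sort_by(|a, b| cost(a).partial_cmp(&cost(b)).unwrap());
        candidates.truncate(beam_width);
        beam = candidates;
    }
    beam.remove(0).iter().map(|t| t.to_string()).collect()
}

fn main() {
    // Stand-in cost model: joining a large table early is expensive,
    // so the search should order tables smallest-first.
    let sizes = |t: &str| match t { "users" => 10.0, "orders" => 100.0, _ => 1000.0 };
    let cost = |plan: &[&str]| -> f64 {
        plan.iter().enumerate().map(|(i, &t)| sizes(t) / (i as f64 + 1.0)).sum()
    };
    let best = beam_search(&["orders", "users", "events"], 5, cost);
    println!("{:?}", best); // prints ["users", "orders", "events"]
}
```

A wider beam explores more plan shapes at the cost of more cost-model evaluations, which is why beam_width trades directly against the sub-5ms inference budget.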

2. Schema AI (Generative Schema Designer)

Location: heliosdb-data/crates/schema-ai

Description

AI-powered schema design from natural language descriptions. Converts NL requirements to optimized database schemas with automatic normalization.

Key Features

  • NL-to-ERD: Natural language to Entity-Relationship Diagrams
  • Automatic Normalization: 1NF to BCNF normalization
  • Index Recommendations: AI-suggested indexes
  • Schema Evolution: Intelligent migration generation
  • Multi-dialect Support: PostgreSQL, MySQL, SQLite DDL generation

Usage

use heliosdb_schema_ai::{SchemaGenerator, SchemaConfig, NormalizationLevel};

let generator = SchemaGenerator::new(SchemaConfig {
    target_normalization: NormalizationLevel::ThirdNormalForm,
    generate_indexes: true,
    target_dialect: "postgresql",
});

// Generate schema from description
let description = "E-commerce system with users who can place orders \
                   containing multiple products. Track inventory levels \
                   and support product categories.";
let schema = generator.generate_from_description(description).await?;

// Output DDL
println!("{}", schema.to_ddl());

// Get ER diagram (Mermaid format)
println!("{}", schema.to_mermaid());

// Generate migration from existing schema
let migration = generator.generate_migration(&current_schema, &target_schema).await?;

Schema Generation Options

| Option | Values | Description |
|---|---|---|
| target_normalization | 1NF, 2NF, 3NF, BCNF | Normalization level |
| generate_indexes | true/false | Auto-generate indexes |
| target_dialect | postgresql, mysql, sqlite | SQL dialect |
| include_constraints | true/false | Generate constraints |

3. RL-Based Intelligent Cache

Location: heliosdb-cache/crates/rl-cache

Description

Reinforcement learning-based cache eviction using Deep Q-Network (DQN). Learns optimal caching policies from access patterns.

Key Features

  • DQN-based Eviction: Deep Q-Network for policy learning
  • Contextual Features: Uses query patterns, time-of-day, workload type
  • Adaptive Learning: Continuous improvement from production traffic
  • Multi-tier Support: Coordinates L1/L2/L3 cache hierarchies
  • Workload-aware: Different policies for OLTP vs OLAP

Usage

use heliosdb_rl_cache::{RLCache, RLCacheConfig, FeatureExtractor};

let config = RLCacheConfig {
    capacity_mb: 1024,
    learning_rate: 0.001,
    discount_factor: 0.99,
    exploration_rate: 0.1,
    replay_buffer_size: 10_000,
};
let cache = RLCache::new(config);

// The cache learns automatically from access patterns
cache.get(&key).await;       // Records access
cache.put(key, value).await; // Learns eviction policy

// Force policy update (normally automatic)
cache.update_policy().await?;

// Get cache statistics
let stats = cache.get_stats();
println!("Hit rate: {:.2}%", stats.hit_rate * 100.0);
println!("Eviction quality: {:.2}", stats.eviction_quality_score);

Configuration

| Option | Default | Description |
|---|---|---|
| capacity_mb | 1024 | Cache size in MB |
| learning_rate | 0.001 | DQN learning rate |
| discount_factor | 0.99 | Future reward discount |
| exploration_rate | 0.1 | Epsilon for exploration |
| update_frequency | 100 | Steps between updates |
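The exploration_rate is the epsilon in an epsilon-greedy policy: occasionally the cache evicts an arbitrary candidate to keep learning, otherwise it evicts the entry its value estimates rank lowest. A dependency-free sketch of that decision rule, with a HashMap standing in for the DQN's Q-value estimates and a pre-drawn uniform sample roll instead of an RNG (these names are illustrative, not the rl-cache crate's API):

```rust
use std::collections::HashMap;

// Epsilon-greedy eviction: with probability epsilon evict an arbitrary
// candidate (exploration), otherwise evict the entry with the lowest
// learned value (exploitation). The HashMap stands in for the DQN's
// Q-value estimates; roll is a pre-drawn uniform sample in [0, 1).
fn choose_eviction(
    q: &HashMap<String, f64>,
    candidates: &[String],
    epsilon: f64,
    roll: f64,
) -> String {
    if roll < epsilon {
        // Exploration: a real implementation would sample uniformly.
        candidates[0].clone()
    } else {
        // Exploitation: evict the entry with the least expected future benefit.
        candidates
            .iter()
            .min_by(|a, b| {
                let qa = q.get(*a).copied().unwrap_or(0.0);
                let qb = q.get(*b).copied().unwrap_or(0.0);
                qa.partial_cmp(&qb).unwrap()
            })
            .unwrap()
            .clone()
    }
}

fn main() {
    let mut q = HashMap::new();
    q.insert("hot_key".to_string(), 0.9);  // frequently re-read
    q.insert("cold_key".to_string(), 0.1); // rarely re-read
    let candidates = vec!["hot_key".to_string(), "cold_key".to_string()];
    // roll = 0.5 >= epsilon = 0.1, so the policy exploits:
    println!("evict {}", choose_eviction(&q, &candidates, 0.1, 0.5)); // prints "evict cold_key"
}
```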

4. Multi-Armed Bandit Load Balancer

Location: heliosdb-cluster/crates/mab-balancer

Description

LinUCB contextual bandit algorithm for intelligent request routing. Balances load while optimizing for latency and throughput.

Key Features

  • Contextual Bandits: LinUCB algorithm with context features
  • Latency Optimization: Routes to fastest available node
  • Adaptive Exploration: Balances exploration vs exploitation
  • Health-aware: Considers node health in routing decisions
  • Multi-objective: Optimizes latency, throughput, and fairness

Usage

use heliosdb_mab_balancer::{
    MABBalancer, BalancerConfig, RequestContext, QueryType, Priority,
};

let config = BalancerConfig {
    alpha: 0.5, // Exploration parameter
    context_features: vec!["query_type", "data_size", "time_of_day"],
    update_batch_size: 100,
};
let balancer = MABBalancer::new(config, nodes);

// Route request with context
let context = RequestContext {
    query_type: QueryType::Read,
    estimated_data_size: 1024,
    priority: Priority::Normal,
};
let selected_node = balancer.select_node(&context).await?;

// Record outcome for learning
balancer.record_outcome(selected_node, &context, latency_ms, success).await?;

// Get routing statistics
let stats = balancer.get_stats();
for (node, node_stats) in stats.per_node {
    println!("{}: avg_latency={:.2}ms, selection_rate={:.2}%",
        node, node_stats.avg_latency, node_stats.selection_rate * 100.0);
}
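To see what the alpha parameter controls, here is a minimal disjoint LinUCB with a 2-dimensional context. Each node (arm) keeps sufficient statistics A and b; its score is the predicted reward plus an alpha-scaled confidence width, so rarely tried nodes still get explored. All names here are illustrative, not the mab-balancer crate's internals.

```rust
// Minimal disjoint LinUCB with a 2-dimensional context, illustrating the
// scoring rule: score = theta^T x + alpha * sqrt(x^T A^{-1} x).
struct LinUcbArm {
    a: [[f64; 2]; 2], // A = I + sum of x * x^T over observed contexts
    b: [f64; 2],      // b = sum of reward * x
}

impl LinUcbArm {
    fn new() -> Self {
        LinUcbArm { a: [[1.0, 0.0], [0.0, 1.0]], b: [0.0, 0.0] }
    }

    // 2x2 matrix inverse of A.
    fn inv(&self) -> [[f64; 2]; 2] {
        let det = self.a[0][0] * self.a[1][1] - self.a[0][1] * self.a[1][0];
        [[ self.a[1][1] / det, -self.a[0][1] / det],
         [-self.a[1][0] / det,  self.a[0][0] / det]]
    }

    // Upper confidence bound on this arm's expected reward for context x.
    fn score(&self, x: [f64; 2], alpha: f64) -> f64 {
        let inv = self.inv();
        let theta = [
            inv[0][0] * self.b[0] + inv[0][1] * self.b[1],
            inv[1][0] * self.b[0] + inv[1][1] * self.b[1],
        ];
        let mean = theta[0] * x[0] + theta[1] * x[1];
        let ax = [
            inv[0][0] * x[0] + inv[0][1] * x[1],
            inv[1][0] * x[0] + inv[1][1] * x[1],
        ];
        let width = (x[0] * ax[0] + x[1] * ax[1]).sqrt();
        mean + alpha * width
    }

    // After routing to this arm, fold the observed reward back in.
    fn update(&mut self, x: [f64; 2], reward: f64) {
        for i in 0..2 {
            for j in 0..2 {
                self.a[i][j] += x[i] * x[j];
            }
            self.b[i] += reward * x[i];
        }
    }
}

fn main() {
    let mut arms = vec![LinUcbArm::new(), LinUcbArm::new()];
    let x = [1.0, 0.0]; // context, e.g. (is_read, normalized_data_size)
    arms[0].update(x, 1.0); // node 0 served this context with low latency
    let best = (0..arms.len())
        .max_by(|&i, &j| arms[i].score(x, 0.5).partial_cmp(&arms[j].score(x, 0.5)).unwrap())
        .unwrap();
    println!("selected node {}", best); // prints "selected node 0"
}
```

A larger alpha widens the confidence term and shifts routing toward exploration; alpha near zero routes greedily to the node with the best observed reward for the current context.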

5. Anomaly Detection

Location: heliosdb-ai/crates/anomaly-detection

Description

Multi-algorithm anomaly detection for database metrics, query patterns, and data quality monitoring.

Supported Algorithms

| Algorithm | Use Case | Strengths |
|---|---|---|
| Isolation Forest | General | Fast, handles high dimensions |
| Local Outlier Factor | Density-based | Good for clusters |
| DBSCAN | Clustering | Finds arbitrary shapes |
| One-Class SVM | Novelty detection | Works with limited data |
| LSTM Autoencoder | Time-series | Captures temporal patterns |
| Statistical (Z-score) | Simple metrics | Interpretable, fast |

Usage

use heliosdb_anomaly_detection::{
    AnomalyDetector, DetectorConfig, Algorithm,
    MetricStream, AnomalyAlert,
};

let config = DetectorConfig {
    algorithm: Algorithm::IsolationForest,
    contamination: 0.01, // Expected anomaly rate
    sensitivity: 0.8,
    window_size: 100,
};
let detector = AnomalyDetector::new(config);

// Train on historical data
detector.fit(&historical_metrics).await?;

// Real-time detection
let mut stream = MetricStream::new();
while let Some(metric) = stream.next().await {
    if let Some(alert) = detector.detect(&metric).await? {
        println!("ANOMALY: {} (score: {:.2}, confidence: {:.2}%)",
            alert.description, alert.anomaly_score, alert.confidence * 100.0);
    }
}

// Batch detection
let anomalies = detector.detect_batch(&metrics).await?;
for anomaly in anomalies {
    println!("Anomaly at index {}: {}", anomaly.index, anomaly.description);
}

Algorithm Selection Guide

  • Query latency monitoring → Isolation Forest or LSTM
  • Connection patterns → DBSCAN
  • Data quality checks → Statistical (Z-score)
  • Novel query detection → One-Class SVM
  • General monitoring → Ensemble (multiple algorithms)
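For simple metrics, the Z-score route is the easiest to reason about: a point is flagged when it sits more than a threshold number of standard deviations from the mean of a recent window. A self-contained sketch of the idea (illustrative only; not the anomaly-detection crate's API):

```rust
// Z-score check over a sliding window: flag a point whose distance from
// the window mean exceeds `threshold` standard deviations.
fn z_score(window: &[f64], value: f64) -> f64 {
    let n = window.len() as f64;
    let mean = window.iter().sum::<f64>() / n;
    let var = window.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    if var == 0.0 { 0.0 } else { (value - mean) / var.sqrt() }
}

fn is_anomalous(window: &[f64], value: f64, threshold: f64) -> bool {
    z_score(window, value).abs() > threshold
}

fn main() {
    // Query latencies hovering around 10ms, then a 50ms spike.
    let window = [10.0, 11.0, 9.0, 10.0, 10.5, 9.5];
    println!("{}", is_anomalous(&window, 50.0, 3.0)); // prints "true"
    println!("{}", is_anomalous(&window, 10.2, 3.0)); // prints "false"
}
```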

6. Time-Series Forecasting

Location: heliosdb-ai/crates/forecasting

Description

Comprehensive time-series forecasting for capacity planning, workload prediction, and trend analysis.

Supported Algorithms

| Algorithm | Best For | Accuracy |
|---|---|---|
| ARIMA | Stationary data | High |
| Prophet | Seasonal + holidays | High |
| LSTM | Complex patterns | Very High |
| Exponential Smoothing | Simple trends | Medium |
| Ensemble | General | Highest |
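The Exponential Smoothing row is the simplest of these to show concretely: the one-step-ahead forecast is a geometrically weighted average of past observations. A minimal sketch of simple exponential smoothing (illustrative only; not the forecasting crate's API):

```rust
// Simple exponential smoothing: the next-step forecast is a geometrically
// weighted average of past observations,
//   s_t = alpha * x_t + (1 - alpha) * s_{t-1}
fn ses_forecast(series: &[f64], alpha: f64) -> f64 {
    let mut level = series[0];
    for &x in &series[1..] {
        level = alpha * x + (1.0 - alpha) * level;
    }
    level // flat forecast for every future step
}

fn main() {
    let hourly_qps = [100.0, 120.0, 110.0, 130.0];
    println!("next-hour estimate: {}", ses_forecast(&hourly_qps, 0.5)); // prints 120
}
```

A larger alpha tracks recent changes more aggressively; a smaller one smooths noise harder, which is the same bias/variance trade-off the richer models make with more parameters.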

Usage

use heliosdb_forecasting::{
    Forecaster, ForecastConfig, Algorithm, Seasonality,
    TimeSeries, Forecast,
};

let config = ForecastConfig {
    algorithm: Algorithm::AutoSelect, // Automatically choose best
    forecast_horizon: 24,             // Hours ahead
    confidence_level: 0.95,
    seasonality: Some(Seasonality::Daily),
};
let forecaster = Forecaster::new(config);

// Fit on historical data
let history = TimeSeries::from_vec(historical_values, timestamps);
forecaster.fit(&history).await?;

// Generate forecast
let forecast = forecaster.predict(24).await?;
println!("Forecast for next 24 hours:");
for point in forecast.points {
    println!("{}: {:.2} (CI: {:.2} - {:.2})",
        point.timestamp, point.value, point.lower_bound, point.upper_bound);
}

// Capacity planning
let capacity_forecast = forecaster.forecast_capacity(
    current_usage,
    growth_rate,
    30, // days
).await?;
println!("Estimated time to 80% capacity: {} days",
    capacity_forecast.days_to_threshold(0.8));

7. AutoML Tuning

Location: heliosdb-ai/crates/automl-tuning

Description

Automatic database configuration tuning using Bayesian Optimization and Genetic Algorithms.

Key Features

  • Bayesian Optimization: Efficient hyperparameter search
  • Genetic Algorithms: Evolves optimal configurations
  • Safe Exploration: Constraints to prevent bad configs
  • A/B Testing: Validates improvements before rollout
  • Workload-aware: Different configs for different workloads

Usage

use heliosdb_automl_tuning::{
    AutoTuner, TunerConfig, TuningAlgorithm, SearchSpace,
    Parameter, ParameterType, Constraint, RollbackPolicy,
};

let search_space = SearchSpace::new()
    .add(Parameter::new("shared_buffers", ParameterType::Memory)
        .range("256MB", "8GB"))
    .add(Parameter::new("work_mem", ParameterType::Memory)
        .range("4MB", "256MB"))
    .add(Parameter::new("max_connections", ParameterType::Integer)
        .range(100, 1000))
    .add(Parameter::new("effective_cache_size", ParameterType::Memory)
        .range("1GB", "32GB"));

let config = TunerConfig {
    algorithm: TuningAlgorithm::BayesianOptimization,
    max_iterations: 50,
    target_metric: "throughput",
    constraints: vec![
        Constraint::MaxLatency(100),     // ms
        Constraint::MinThroughput(1000), // qps
    ],
};
let tuner = AutoTuner::new(config, search_space);

// Run tuning
let result = tuner.tune(&workload_benchmark).await?;
println!("Optimal configuration found:");
for (param, value) in &result.best_config {
    println!("  {} = {}", param, value);
}
println!("Improvement: {:.1}%", result.improvement_percent);

// Apply configuration (with rollback support)
tuner.apply_config(&result.best_config, RollbackPolicy::OnRegression).await?;

8. Auto-Index

Location: heliosdb-autonomous/crates/auto-index

Description

ML-based automatic index recommendation and management based on workload analysis.

Key Features

  • Workload Analysis: Learns from query patterns
  • Index Recommendations: Suggests optimal indexes
  • Impact Prediction: Estimates performance improvement
  • Automatic Creation: Creates indexes during low-traffic periods
  • Index Consolidation: Removes redundant indexes

Usage

use std::time::Duration;

use heliosdb_auto_index::{AutoIndexer, IndexerConfig, WorkloadSample, ApplyPolicy};

let config = IndexerConfig {
    analysis_window: Duration::from_secs(24 * 60 * 60), // 24-hour window
    min_improvement_threshold: 0.10, // 10% improvement required
    max_indexes_per_table: 10,
    auto_create: true,
    maintenance_window: "02:00-05:00",
};
let indexer = AutoIndexer::new(config);

// Analyze workload
indexer.record_query(&query, execution_stats).await?;

// Get recommendations
let recommendations = indexer.get_recommendations().await?;
for rec in recommendations {
    println!("Table: {}", rec.table);
    println!("Suggested index: {}", rec.index_definition);
    println!("Estimated improvement: {:.1}%", rec.estimated_improvement * 100.0);
    println!("Affected queries: {}", rec.affected_query_count);
}

// Apply recommendations
indexer.apply_recommendations(ApplyPolicy::RequireApproval).await?;

// Get index health report
let health = indexer.get_index_health().await?;
for (table, stats) in health.per_table {
    println!("{}: {} indexes, {:.1}% usage efficiency",
        table, stats.index_count, stats.usage_efficiency * 100.0);
}

9. Probabilistic Data Structures

Location: heliosdb-models/crates/probabilistic

Description

Memory-efficient probabilistic data structures for approximate queries.

Supported Structures

| Structure | Use Case | Space | Error Rate |
|---|---|---|---|
| Bloom Filter | Membership testing | O(n) bits | Configurable FP |
| Count-Min Sketch | Frequency estimation | O(1/ε · ln 1/δ) | Configurable |
| HyperLogLog | Cardinality estimation | ~1.5KB | ~2% |
| T-Digest | Percentile estimation | O(compression) | ~1% |
| Cuckoo Filter | Membership + delete | O(n) bits | Configurable |
| MinHash | Similarity estimation | O(k) | ~1/√k |
| SimHash | Near-duplicate detection | O(1) | Configurable |

Usage

use heliosdb_probabilistic::{
    BloomFilter, CountMinSketch, HyperLogLog,
    TDigest, CuckooFilter, MinHash,
};

// Bloom Filter - membership testing
let mut bloom = BloomFilter::new(1_000_000, 0.01); // 1M items, 1% FP rate
bloom.insert(&"user123");
assert!(bloom.contains(&"user123"));  // No false negatives: always true
assert!(!bloom.contains(&"user456")); // Usually true, but ~1% chance of a false positive

// Count-Min Sketch - frequency estimation
let mut cms = CountMinSketch::new(0.01, 0.001); // 1% error, 99.9% confidence
cms.add(&"query_type_a", 1);
cms.add(&"query_type_a", 1);
println!("Estimated count: {}", cms.estimate(&"query_type_a")); // never underestimates

// HyperLogLog - cardinality estimation
let mut hll = HyperLogLog::new(14); // 2^14 registers, ~0.8% standard error
for user_id in user_ids {
    hll.add(&user_id);
}
println!("Estimated unique users: {}", hll.cardinality());

// T-Digest - percentile estimation
let mut tdigest = TDigest::new(100); // compression factor
for latency in latencies {
    tdigest.add(latency);
}
println!("p99 latency: {:.2}ms", tdigest.percentile(0.99));

// MinHash - similarity estimation
let minhash1 = MinHash::new(128).add_all(&set1);
let minhash2 = MinHash::new(128).add_all(&set2);
println!("Jaccard similarity: {:.2}", minhash1.similarity(&minhash2));
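The false-positive rate chosen for a Bloom filter maps directly to memory via the standard sizing formulas, which is worth checking before picking aggressive targets. A general-purpose calculation (not a HeliosDB API):

```rust
// Standard Bloom filter sizing: bits m and hash count k for n items at a
// target false-positive rate p:
//   m = -n * ln(p) / (ln 2)^2,   k = (m / n) * ln 2
fn bloom_params(n: u64, p: f64) -> (u64, u32) {
    let ln2 = std::f64::consts::LN_2;
    let m = (-(n as f64) * p.ln() / (ln2 * ln2)).ceil();
    let k = (m / n as f64 * ln2).round();
    (m as u64, k as u32)
}

fn main() {
    // The 1M-item, 1%-FP filter from the example above costs roughly 1.2 MB:
    let (bits, hashes) = bloom_params(1_000_000, 0.01);
    println!("{} bits (~{} KiB), {} hash functions", bits, bits / 8 / 1024, hashes);
}
```

Halving the false-positive rate adds only a constant number of bits per item, so tightening p from 1% to 0.1% is usually cheap relative to the dataset itself.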

Integration with HeliosDB

Enabling Features

Cargo.toml
[dependencies]
heliosdb = { version = "7.1", features = [
    "neural-planner",
    "schema-ai",
    "rl-cache",
    "mab-balancer",
    "anomaly-detection",
    "forecasting",
    "automl-tuning",
    "auto-index",
    "probabilistic",
] }

Configuration

heliosdb.toml
[ai.neural_planner]
enabled = true
model_path = "models/query_planner.onnx"
inference_timeout_ms = 5
fallback_to_traditional = true

[ai.schema_ai]
enabled = true
default_normalization = "3NF"

[cache.rl]
enabled = true
capacity_mb = 2048
learning_enabled = true

[cluster.mab_balancer]
enabled = true
alpha = 0.5
update_frequency = 100

[monitoring.anomaly_detection]
enabled = true
algorithm = "ensemble"
alert_threshold = 0.8

[monitoring.forecasting]
enabled = true
algorithm = "auto"
horizon_hours = 24

[tuning.automl]
enabled = true
maintenance_window = "02:00-05:00"
require_approval = true

[indexing.auto_index]
enabled = true
min_improvement_threshold = 0.10
max_indexes_per_table = 10

Best Practices

1. Start with Defaults

All packages have sensible defaults. Start with defaults and tune based on your workload.

2. Monitor Before Enabling

Monitor your workload characteristics before enabling AI features:

  • Query patterns and frequency
  • Data distribution
  • Peak vs off-peak traffic

3. Use Gradual Rollout

Enable features gradually:

  1. Start with read-only features (anomaly detection, forecasting)
  2. Enable learning features (neural planner, RL cache)
  3. Enable write features (auto-index, automl tuning)
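The stages map directly onto the heliosdb.toml sections shown under Configuration: stage 1 turns on only the observe-only features and leaves learning and write features disabled. A sketch:

```toml
# Stage 1: observe-only features on; learning/write features off.
[monitoring.anomaly_detection]
enabled = true
algorithm = "ensemble"

[monitoring.forecasting]
enabled = true
horizon_hours = 24

# Revisit these in stages 2 and 3.
[ai.neural_planner]
enabled = false

[cache.rl]
enabled = false

[indexing.auto_index]
enabled = false

[tuning.automl]
enabled = false
```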

4. Set Safety Constraints

Always configure safety constraints for features that modify behavior:

  • Max latency thresholds
  • Rollback policies
  • Approval requirements

5. Review Recommendations

AI recommendations should be reviewed before automatic application, especially for:

  • Index creation
  • Configuration changes
  • Schema modifications

Troubleshooting

Common Issues

Neural Planner slow inference

  • Check model file is loaded (not re-loading per query)
  • Reduce beam width if latency exceeds 5ms
  • Use ONNX runtime optimizations

RL Cache low hit rate

  • Allow more training time (10,000+ accesses)
  • Check exploration rate isn’t too high
  • Verify feature extraction includes relevant context

Anomaly Detection false positives

  • Lower the contamination parameter (it sets the expected anomaly rate; smaller values flag fewer points)
  • Use ensemble mode for higher precision
  • Train on longer historical period

AutoML Tuning not improving

  • Expand search space ranges
  • Increase iteration count
  • Check constraint feasibility


This guide covers HeliosDB v7.1.2 Tier 2/3 AI/ML features.