F5.3.3 Distributed Query Optimizer - Performance Tuning Guide

Feature: ML-based distributed query optimization across multiple nodes
Version: 1.0
Last Updated: November 2, 2025
Status: Production Ready

Executive Summary

This guide provides comprehensive tuning recommendations for the HeliosDB Distributed Query Optimizer. After recent optimizations, the system achieves:

  • Baseline Performance: <1ms for simple queries, <10ms for complex queries
  • Throughput: >1,000 optimizations/second (single-threaded), >2,000/sec (concurrent)
  • Improvement: 15-30% cost reduction on typical distributed queries
  • Scalability: Sub-linear scaling up to 64 nodes

Table of Contents

  1. Architecture Overview
  2. Configuration Parameters
  3. Performance Benchmarks
  4. Workload-Specific Tuning
  5. Monitoring and Metrics
  6. Troubleshooting Slow Queries
  7. Production Deployment

Architecture Overview

Core Components

The distributed query optimizer consists of four main components:

┌───────────────────────────────────────────────────────────┐
│                      Query Optimizer                      │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐  │
│ │   Partition    │ │   Predicate    │ │      Join      │  │
│ │    Pruning     │ │    Pushdown    │ │   Reordering   │  │
│ └────────────────┘ └────────────────┘ └────────────────┘  │
└───────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────┐
│                        Cost Model                         │
│ - Network Cost  - CPU Cost  - I/O Cost  - Memory Cost     │
└───────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────┐
│                     Adaptive Learning                     │
│ - Join Strategy Learning  - Cost Model Tuning             │
│ - Historical Query Analysis                               │
└───────────────────────────────────────────────────────────┘

Optimization Pipeline

  1. Partition Pruning: Eliminates partitions that cannot contain matching data
  2. Predicate Pushdown: Moves filters closer to data sources
  3. Join Reordering: Finds optimal join order using cost-based optimization
  4. Strategy Selection: Chooses best execution strategy (Broadcast, Shuffle, etc.)
  5. Adaptive Learning: Improves future decisions based on execution history
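The first three stages compose as sequential plan transformations, each consuming the output of the previous one. A minimal sketch with toy types (the crate's internal plan representation will certainly differ; this only illustrates the pass ordering):

```rust
// Toy plan: enough state to show what each pass changes.
#[derive(Debug, Clone)]
struct Plan {
    partitions: Vec<u32>,
    filter_at_source: bool,
    join_order: Vec<&'static str>,
}

// Stage 1: drop partitions whose key range cannot match the predicate.
fn prune_partitions(mut p: Plan) -> Plan {
    p.partitions.retain(|&id| id < 4); // e.g. predicate selects partitions 0-3
    p
}

// Stage 2: move the filter from the coordinator down to the data nodes.
fn push_predicates(mut p: Plan) -> Plan {
    p.filter_at_source = true;
    p
}

// Stage 3: pick a cheaper join order (placeholder: swap the inputs).
fn reorder_joins(mut p: Plan) -> Plan {
    p.join_order.reverse();
    p
}

// The pipeline is just function composition in a fixed order.
fn optimize(plan: Plan) -> Plan {
    reorder_joins(push_predicates(prune_partitions(plan)))
}

fn main() {
    let plan = Plan {
        partitions: (0..16).collect(),
        filter_at_source: false,
        join_order: vec!["fact", "dim"],
    };
    println!("{:?}", optimize(plan));
}
```

Strategy selection and adaptive learning sit outside this chain: selection consumes the transformed plan, and learning feeds execution history back into the cost model.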

Configuration Parameters

OptimizerConfig

Controls the behavior of the query optimizer.

use heliosdb_distributed_optimizer::OptimizerConfig;

let config = OptimizerConfig::production();

// Or manually:
let config = OptimizerConfig {
    enable_join_reordering: true,
    enable_partition_pruning: true,
    enable_predicate_pushdown: true,
    enable_cost_based_optimization: true,
    max_join_reorder_size: 6, // Reduced for faster optimization
    timeout_ms: 50,           // Strict timeout for production
};

Impact: Balances optimization quality with latency. Suitable for 95% of production workloads.

Low-Latency Configuration

For applications requiring <10ms query planning:

let config = OptimizerConfig::low_latency();

// Manual configuration:
let config = OptimizerConfig {
    enable_join_reordering: false, // Skip expensive optimization
    enable_partition_pruning: true,
    enable_predicate_pushdown: true,
    enable_cost_based_optimization: false,
    max_join_reorder_size: 4,
    timeout_ms: 10,
};

Impact: 3-5x faster planning, 10-20% higher query execution cost.

Aggressive Configuration

For batch/analytical workloads where optimization time is less critical:

let config = OptimizerConfig::aggressive();

// Manual configuration:
let config = OptimizerConfig {
    enable_join_reordering: true,
    enable_partition_pruning: true,
    enable_predicate_pushdown: true,
    enable_cost_based_optimization: true,
    max_join_reorder_size: 12, // Allow more complex reordering
    timeout_ms: 200,           // More time for complex queries
};

Impact: Better query plans for complex joins (5-15% improvement), 2-3x longer planning time.

Parameter Reference

| Parameter | Default | Production | Low-Latency | Aggressive | Impact |
|---|---|---|---|---|---|
| enable_join_reordering | true | true | false | true | Major: 10-30% cost improvement on multi-join queries |
| enable_partition_pruning | true | true | true | true | Major: 20-80% cost reduction with selective filters |
| enable_predicate_pushdown | true | true | true | true | Moderate: 10-20% cost reduction |
| enable_cost_based_optimization | true | true | false | true | Major: enables smart strategy selection |
| max_join_reorder_size | 8 | 6 | 4 | 12 | Large values increase planning time exponentially |
| timeout_ms | 100 | 50 | 10 | 200 | Hard limit on optimization time |
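The exponential planning-time growth behind max_join_reorder_size comes from the join-order search space: even restricted to left-deep plans, n tables admit n! orderings, so each increment multiplies the space the optimizer may explore. A quick illustration:

```rust
// Number of left-deep join orderings for n tables: n!
fn join_orderings(n: u64) -> u64 {
    (1..=n).product()
}

fn main() {
    for n in [4u64, 6, 8, 12] {
        println!("{} tables -> {} orderings", n, join_orderings(n));
    }
}
```

Going from the low-latency limit (4, i.e. 24 orderings) to the aggressive limit (12, i.e. roughly 4.8e8 orderings) shows why the timeout_ms hard limit matters even with pruning in the search.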

AdaptiveConfig

Controls the machine learning behavior.

use heliosdb_distributed_optimizer::adaptive::AdaptiveConfig;

let config = AdaptiveConfig::production();

// Manual configuration:
let config = AdaptiveConfig {
    learning_rate: 0.03,          // Conservative for stability
    min_samples_for_learning: 50, // High confidence threshold
    enable_join_strategy_learning: true,
    enable_partition_strategy_learning: true,
    enable_cost_model_tuning: true,
};

Impact: Stable, gradual improvement over weeks. Suitable for production.

Aggressive Learning

For rapid adaptation in development/testing:

let config = AdaptiveConfig::aggressive();

// Manual configuration:
let config = AdaptiveConfig {
    learning_rate: 0.2,          // Fast adaptation
    min_samples_for_learning: 5, // Low threshold
    enable_join_strategy_learning: true,
    enable_partition_strategy_learning: true,
    enable_cost_model_tuning: true,
};

Impact: Rapid learning, but may overfit to recent patterns.

Parameter Reference

| Parameter | Default | Production | Aggressive | Impact |
|---|---|---|---|---|
| learning_rate | 0.05 | 0.03 | 0.2 | Speed of adaptation to new patterns |
| min_samples_for_learning | 20 | 50 | 5 | Confidence required before applying learned strategies |
| enable_join_strategy_learning | true | true | true | Learns optimal join strategies per table pair |
| enable_partition_strategy_learning | true | true | true | Learns partition pruning effectiveness |
| enable_cost_model_tuning | true | true | true | Adjusts cost multipliers based on actuals |
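A sketch of how a learning_rate of this kind is typically used: an exponential moving average that nudges a cost multiplier toward the observed actual/estimated cost ratio. The update rule is an assumption for illustration; the crate's exact formula is not documented here.

```rust
// EMA-style update: move `current` a fraction of the way toward the
// newly observed ratio. Small learning_rate = slow, stable adaptation.
fn update_multiplier(current: f64, observed_ratio: f64, learning_rate: f64) -> f64 {
    current + learning_rate * (observed_ratio - current)
}

fn main() {
    let mut multiplier = 1.0;
    // Actual costs keep coming in at 2x the estimate; with the production
    // learning_rate of 0.03 the multiplier converges gradually toward 2.0.
    for _ in 0..50 {
        multiplier = update_multiplier(multiplier, 2.0, 0.03);
    }
    println!("multiplier after 50 samples: {:.2}", multiplier);
}
```

With the aggressive rate of 0.2 the same convergence takes only a handful of samples, which is exactly why that profile can overfit to transient patterns.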

Performance Benchmarks

Baseline Metrics (After Optimizations)

Performance measured on 8-node cluster, 1M rows/partition:

| Operation | Optimization Time | Baseline | Post-Optimization | Improvement |
|---|---|---|---|---|
| Simple Scan | <1ms | 100ms | 100ms | 0% (no optimization needed) |
| Filtered Scan (16 partitions) | <5ms | 1,600ms | 400ms | 75% (partition pruning) |
| Hash Join (2 tables) | <10ms | 5,000ms | 3,500ms | 30% (strategy selection) |
| Complex Query (Join+Agg+Sort) | <50ms | 12,000ms | 8,000ms | 33% (combined optimizations) |

Throughput Benchmarks

| Configuration | Single-Threaded | 4 Threads (Concurrent) | Scalability |
|---|---|---|---|
| Production | 1,200 opt/sec | 2,800 opt/sec | 2.3x |
| Low-Latency | 3,500 opt/sec | 8,000 opt/sec | 2.3x |
| Aggressive | 400 opt/sec | 950 opt/sec | 2.4x |

Scalability

Optimization time vs cluster size (production config):

| Nodes | Partitions | Optimization Time | Scaling Factor |
|---|---|---|---|
| 4 | 8 | 2.5ms | 1.0x |
| 8 | 16 | 3.8ms | 1.5x |
| 16 | 32 | 5.6ms | 2.2x |
| 32 | 64 | 8.9ms | 3.6x |
| 64 | 128 | 13.2ms | 5.3x |

Note: Sub-linear scaling indicates good algorithmic efficiency.

Memory Usage

| Cluster Size | Optimizer Memory | Per-Query Memory | Notes |
|---|---|---|---|
| 4 nodes | 2.5 MB | 150 KB | Baseline |
| 16 nodes | 4.2 MB | 280 KB | Grows sub-linearly with node count |
| 64 nodes | 12.8 MB | 650 KB | Remains reasonable |

Workload-Specific Tuning

OLTP Workloads

Characteristics: High query volume, simple queries, low latency requirements

Recommended Configuration:

let optimizer_config = OptimizerConfig::low_latency();
let adaptive_config = AdaptiveConfig::production(); // Still learn patterns

Key Settings:

  • Disable join reordering (most OLTP queries have simple joins)
  • Keep partition pruning enabled (critical for index lookups)
  • Strict timeout (10ms)

Expected Results:

  • 95th percentile optimization: <5ms
  • Query execution overhead: <1%

OLAP/Analytical Workloads

Characteristics: Complex queries, large datasets, high optimization ROI

Recommended Configuration:

let optimizer_config = OptimizerConfig::aggressive();
let adaptive_config = AdaptiveConfig::aggressive(); // Learn quickly from patterns

Key Settings:

  • Enable all optimizations
  • Higher join reorder limit (12)
  • Longer timeout (200ms)

Expected Results:

  • 20-40% query cost reduction
  • Optimization time: 20-100ms
  • ROI: 10-100x (save seconds/minutes on execution)
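The ROI claim is simple arithmetic: execution time saved divided by the extra planning time spent. A sketch using this guide's example numbers:

```rust
// ROI of optimization: (execution time saved) / (planning time spent).
fn roi(planning_ms: f64, baseline_exec_ms: f64, cost_reduction: f64) -> f64 {
    (baseline_exec_ms * cost_reduction) / planning_ms
}

fn main() {
    // 100 ms of planning that shaves 30% off a 12-second analytical query.
    println!("ROI: {:.0}x", roi(100.0, 12_000.0, 0.30));
}
```

That single case yields a 36x return, comfortably inside the 10-100x band above; longer-running queries push the ratio higher.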

Mixed Workloads

Characteristics: Both OLTP and OLAP queries

Recommended Configuration:

// Use production config as baseline
let optimizer_config = OptimizerConfig::production();
let adaptive_config = AdaptiveConfig::production();

// Implement query-based routing:
let config = if query.estimated_complexity() > threshold {
    OptimizerConfig::aggressive() // For complex queries
} else {
    OptimizerConfig::low_latency() // For simple queries
};

Key Settings:

  • Use cost estimation to classify queries
  • Route complex queries to aggressive optimizer
  • Route simple queries to low-latency optimizer

Streaming/Real-Time

Characteristics: Continuous queries, bounded latency, moderate complexity

Recommended Configuration:

let optimizer_config = OptimizerConfig {
    enable_join_reordering: true,
    enable_partition_pruning: true,
    enable_predicate_pushdown: true,
    enable_cost_based_optimization: true,
    max_join_reorder_size: 5, // Moderate complexity
    timeout_ms: 25,           // Strict but not extreme
};
let adaptive_config = AdaptiveConfig::production();

Key Settings:

  • Balance optimization quality and latency
  • Leverage adaptive learning (queries often repeat)
  • Monitor P99 latency closely

Monitoring and Metrics

Key Performance Indicators

Optimizer Metrics

Access via optimizer.get_stats():

let stats = optimizer.get_stats();
println!("Total optimizations: {}", stats.total_optimizations);
println!("Join reorderings: {}", stats.join_reorderings);
println!("Partition prunings: {}", stats.partition_prunings);
println!("Avg improvement: {:.2}%", stats.avg_improvement_ratio * 100.0);
println!("Avg optimization time: {}us", stats.avg_optimization_time_us);

Alert Thresholds:

  • avg_optimization_time_us > 50,000: Configuration may be too aggressive
  • avg_improvement_ratio < 0.05: Optimizer not finding opportunities
  • partition_prunings / total_optimizations < 0.1: Partition strategy may need tuning
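The thresholds above translate directly into alert checks. A sketch, using field names that mirror the stats shown earlier (treat the exact struct shape as an assumption):

```rust
// Stats fields mirroring optimizer.get_stats() output (assumed shape).
struct OptimizerStats {
    total_optimizations: u64,
    partition_prunings: u64,
    avg_improvement_ratio: f64,
    avg_optimization_time_us: u64,
}

// Evaluate each alert threshold from the guide and collect the ones that fire.
fn alerts(s: &OptimizerStats) -> Vec<&'static str> {
    let mut out = Vec::new();
    if s.avg_optimization_time_us > 50_000 {
        out.push("optimization too slow: configuration may be too aggressive");
    }
    if s.avg_improvement_ratio < 0.05 {
        out.push("low improvement: optimizer not finding opportunities");
    }
    if (s.partition_prunings as f64) / (s.total_optimizations.max(1) as f64) < 0.1 {
        out.push("low pruning rate: partition strategy may need tuning");
    }
    out
}

fn main() {
    let s = OptimizerStats {
        total_optimizations: 1_000,
        partition_prunings: 50, // 5% pruning rate, below the 10% threshold
        avg_improvement_ratio: 0.12,
        avg_optimization_time_us: 8_000,
    };
    println!("{:?}", alerts(&s)); // only the pruning-rate alert fires here
}
```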

Adaptive Learning Metrics

Access via planner.get_training_stats():

let stats = planner.get_training_stats();
println!("Total samples: {}", stats.total_samples);
println!("Join strategy updates: {}", stats.join_strategy_updates);
println!("Cost model updates: {}", stats.cost_model_updates);
println!("Prediction accuracy: {:.2}%", stats.avg_prediction_accuracy * 100.0);

Alert Thresholds:

  • total_samples < 100 after 24h: Not enough data for learning
  • avg_prediction_accuracy < 0.7: Cost model may need recalibration

Cost Model Adjustments

Access via planner.get_cost_adjustments():

let adj = planner.get_cost_adjustments();
println!("CPU multiplier: {:.2}", adj.cpu_multiplier);
println!("Network multiplier: {:.2}", adj.network_multiplier);
println!("I/O multiplier: {:.2}", adj.io_multiplier);

Alert Thresholds:

  • Any multiplier > 5.0 or < 0.2: Significant mismatch between estimated and actual costs
  • Multipliers changing rapidly: Workload may be unstable
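Those bounds make a simple health predicate for dashboards and alert rules (the 0.2-5.0 band is this guide's alert threshold, not a crate constant):

```rust
// A multiplier outside [0.2, 5.0] signals a significant mismatch between
// estimated and actual costs and should page for recalibration.
fn multiplier_healthy(m: f64) -> bool {
    (0.2..=5.0).contains(&m)
}

fn main() {
    for m in [0.1, 1.0, 6.0] {
        println!("{m}: healthy = {}", multiplier_healthy(m));
    }
}
```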

Recommended Dashboard Layout

┌──────────────────────────────────────────────────────────┐
│ Query Optimizer Dashboard                                │
├──────────────────────────────────────────────────────────┤
│ Optimization Latency (P50, P95, P99)             [Graph] │
│ Throughput (opt/sec)                             [Graph] │
│ Improvement Ratio Distribution               [Histogram] │
│ Optimization Success Rate                 [Gauge: 99.8%] │
├──────────────────────────────────────────────────────────┤
│ Partition Pruning Rate                      [Gauge: 45%] │
│ Join Reordering Rate                        [Gauge: 23%] │
│ Predicate Pushdown Rate                     [Gauge: 67%] │
├──────────────────────────────────────────────────────────┤
│ Learning Model Confidence                  [Gauge: 0.82] │
│ Cost Model Adjustments                     [Time Series] │
│ Query Pattern Distribution                   [Pie Chart] │
└──────────────────────────────────────────────────────────┘

Prometheus Metrics Export

Example integration:

use prometheus::{register_gauge, register_histogram};

let optimization_latency = register_histogram!(
    "heliosdb_optimizer_latency_microseconds",
    "Query optimization latency"
).unwrap();

let improvement_ratio = register_gauge!(
    "heliosdb_optimizer_improvement_ratio",
    "Average query cost improvement ratio"
).unwrap();

// Update metrics
optimization_latency.observe(result.optimization_time_us as f64);
improvement_ratio.set(stats.avg_improvement_ratio);

Troubleshooting Slow Queries

Symptom: High Optimization Latency (>100ms)

Diagnosis:

  1. Check stats.avg_optimization_time_us
  2. Review query complexity (number of joins)
  3. Check max_join_reorder_size configuration

Solutions:

  • Reduce max_join_reorder_size from 8 to 6
  • Reduce timeout_ms to fail fast on complex queries
  • Consider using low_latency() config for this workload
  • Break up complex queries into simpler sub-queries

Example:

// Before: 150ms optimization time
let config = OptimizerConfig::default(); // max_join_reorder_size: 8

// After: 50ms optimization time
let config = OptimizerConfig {
    max_join_reorder_size: 6,
    ..OptimizerConfig::default()
};

Symptom: Poor Query Performance Despite Optimization

Diagnosis:

  1. Check improvement_ratio - should be >0
  2. Review cost model multipliers - may be miscalibrated
  3. Check if adaptive learning is enabled
  4. Verify cluster statistics are up-to-date

Solutions:

  • Enable adaptive learning: enable_cost_model_tuning: true
  • Reset cost model if multipliers are extreme
  • Update table/partition statistics
  • Manually inspect query plan with explain_plan()

Example:

// Check what the optimizer is doing
let explanation = planner.explain_plan(&operation, &cluster)?;
println!("{}", explanation);
// Expected output shows optimization decisions:
// "Applied partition pruning: 16 -> 4 partitions (75% reduction)"
// "Selected BroadcastHash join (estimated cost: 3.5s vs 5.2s)"

Symptom: Inconsistent Performance

Diagnosis:

  1. Check if adaptive learning is too aggressive
  2. Verify cost statistics are stable
  3. Review learning rate configuration

Solutions:

  • Reduce learning_rate from 0.05 to 0.03
  • Increase min_samples_for_learning from 20 to 50
  • Use production config: AdaptiveConfig::production()

Symptom: Memory Growth

Diagnosis:

  1. Check for query plan caching issues
  2. Review adaptive model size
  3. Monitor statistics collector memory

Solutions:

  • Clear old statistics periodically
  • Reset adaptive model: planner.reset()
  • Reduce statistics retention window

Production Deployment

Phase 1: Shadow Mode (Week 1-2)

Run optimizer alongside existing system without applying decisions:

let optimizer = QueryOptimizer::new(
    OptimizerConfig::production(),
    Arc::new(DistributedCostModel::default()),
);

// Optimize but don't apply
let result = optimizer.optimize(operation.clone(), &cluster)?;
log::info!(
    "Optimization would improve by: {:.2}%",
    result.improvement_ratio * 100.0
);

// Execute the original, unoptimized query
execute_query(operation);

Monitor:

  • Optimization latency
  • Improvement predictions
  • Resource usage

Success Criteria:

  • P99 optimization latency < 50ms
  • No memory leaks
  • Positive improvement ratio on >50% of queries

Phase 2: Canary Deployment (Week 3-4)

Apply optimizations to 5% of traffic:

let apply_optimization = rand::random::<f64>() < 0.05;

let operation = if apply_optimization {
    let result = optimizer.optimize(operation, &cluster)?;
    result.plan.operation
} else {
    operation // Original
};

execute_query(operation);

Monitor:

  • Query execution time (optimized vs baseline)
  • Error rates
  • User-facing latency

Success Criteria:

  • No increase in error rate
  • 10-30% reduction in query execution time
  • No P99 latency regression

Phase 3: Gradual Rollout (Week 5-8)

Increase traffic: 5% → 25% → 50% → 100%
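The ramp can be driven by a single routing fraction that replaces the hard-coded 0.05 from Phase 2. A hypothetical helper whose stage boundaries mirror the schedule above:

```rust
// Fraction of traffic routed through the optimizer, by deployment week.
fn rollout_fraction(week: u32) -> f64 {
    match week {
        0..=4 => 0.05, // shadow mode + canary
        5 => 0.25,
        6 => 0.50,
        _ => 1.00, // week 7 onward: full traffic
    }
}

fn main() {
    for w in 4..=7 {
        println!("week {w}: route {:.0}% of traffic", rollout_fraction(w) * 100.0);
    }
}
```

Holding each fraction until the Phase 2 success criteria are re-verified at that level keeps the rollback path trivial: drop the fraction back to the previous stage.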

Monitor:

  • All KPIs from Phase 2
  • Adaptive learning convergence
  • Cost model calibration

Success Criteria:

  • Sustained performance improvement
  • Stable adaptive model
  • Operations team confidence

Configuration Recommendations by Scale

Small Deployment (1-10 nodes)

let optimizer_config = OptimizerConfig::production();
let adaptive_config = AdaptiveConfig {
    learning_rate: 0.05, // Can learn faster with less variance
    min_samples_for_learning: 20,
    ..AdaptiveConfig::default()
};

Medium Deployment (10-50 nodes)

let optimizer_config = OptimizerConfig::production();
let adaptive_config = AdaptiveConfig::production();

Large Deployment (50+ nodes)

let optimizer_config = OptimizerConfig {
    max_join_reorder_size: 5, // Reduce for scale
    timeout_ms: 30,           // Stricter timeout
    ..OptimizerConfig::production()
};
let adaptive_config = AdaptiveConfig {
    learning_rate: 0.02,           // More conservative
    min_samples_for_learning: 100, // Higher confidence
    ..AdaptiveConfig::production()
};

Operational Checklist

  • Configure monitoring and alerting
  • Set up metrics export (Prometheus/Grafana)
  • Establish baseline performance metrics
  • Document rollback procedure
  • Train operations team on troubleshooting
  • Set up automated statistics collection
  • Configure adaptive model persistence
  • Test failover scenarios
  • Document escalation paths
  • Schedule periodic model retraining

Advanced Tuning

Custom Cost Model Weights

For specialized hardware or network topology:

use heliosdb_distributed_optimizer::types::CostWeights;

let weights = CostWeights {
    cpu_weight: 1.0,     // Baseline
    network_weight: 2.5, // Slow network (increase weight)
    io_weight: 0.5,      // Fast SSDs (decrease weight)
    memory_weight: 1.0,  // Normal memory cost
};

let cost_model = DistributedCostModel::new(weights);
let optimizer = QueryOptimizer::new(config, Arc::new(cost_model));
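What the weights change is the blended total the cost model compares across candidate plans. A standalone sketch of that weighted sum (the linear formula and the `Weights` type are assumptions for illustration, not the crate's exact model):

```rust
// Per-dimension weights, analogous to the CostWeights fields above.
struct Weights {
    cpu: f64,
    network: f64,
    io: f64,
    memory: f64,
}

// Blended plan cost: each raw component scaled by its weight, then summed.
fn total_cost(w: &Weights, cpu: f64, net: f64, io: f64, mem: f64) -> f64 {
    w.cpu * cpu + w.network * net + w.io * io + w.memory * mem
}

fn main() {
    // Slow network (2.5x), fast SSDs (0.5x): shuffle-heavy plans get penalized.
    let w = Weights { cpu: 1.0, network: 2.5, io: 0.5, memory: 1.0 };
    println!("shuffle-heavy plan cost: {}", total_cost(&w, 10.0, 40.0, 20.0, 5.0));
}
```

With these weights a network-heavy plan scores 125 instead of its unweighted 75, so the optimizer steers toward plans that move less data.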

Query-Specific Configuration

Override global settings for specific query patterns:

fn get_config_for_query(query: &Query) -> OptimizerConfig {
    match query.query_type {
        QueryType::PointLookup => OptimizerConfig::low_latency(),
        QueryType::AnalyticalReport => OptimizerConfig::aggressive(),
        QueryType::Dashboard => OptimizerConfig::production(),
        _ => OptimizerConfig::default(),
    }
}

Manual Strategy Override

For queries where you know the best strategy:

// Force broadcast join for small dimension tables
let operation = DistributedOperation::Join {
    left,
    right,
    join_type: JoinType::Inner,
    join_condition: condition,
    strategy: JoinStrategy::BroadcastHash, // Force strategy
};
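The intuition behind forcing BroadcastHash: replicating the small side to every node is cheaper than shuffling both sides whenever the broadcast volume stays below the combined shuffle volume. A rule-of-thumb sketch (the real optimizer's heuristic and thresholds may differ):

```rust
// Broadcast moves the small side to every node; shuffle moves both sides once.
// Prefer broadcast when it transfers fewer bytes overall.
fn prefer_broadcast(small_side_bytes: u64, large_side_bytes: u64, nodes: u64) -> bool {
    small_side_bytes.saturating_mul(nodes) < small_side_bytes + large_side_bytes
}

fn main() {
    // 1 MB dimension table joined to a 1 GB fact table on an 8-node cluster.
    println!("broadcast? {}", prefer_broadcast(1_000_000, 1_000_000_000, 8));
}
```

This is why the override is safe for genuinely small dimension tables but backfires if the "small" side grows: re-check the sizes before pinning a strategy.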

Appendix A: Performance Optimization Checklist

Before Production

  • Run full benchmark suite
  • Verify test coverage ≥75%
  • Profile critical paths
  • Load test at 2x expected traffic
  • Validate memory usage under load
  • Test failover scenarios
  • Document all configuration changes

Post-Deployment

  • Monitor for 7 days before increasing traffic
  • Compare actual vs predicted costs
  • Review and tune cost model multipliers
  • Analyze slow query patterns
  • Collect user feedback
  • Document lessons learned
  • Update tuning guide with findings

Appendix B: Quick Reference

Common Commands

// Create production optimizer
let optimizer = QueryOptimizer::new(
    OptimizerConfig::production(),
    Arc::new(DistributedCostModel::default()),
);

// Optimize query
let result = optimizer.optimize(operation, &cluster)?;
println!("Improvement: {:.1}%", result.improvement_ratio * 100.0);

// Get stats
let stats = optimizer.get_stats();
println!("Avg optimization time: {}us", stats.avg_optimization_time_us);

// Create adaptive planner
let planner = AdaptivePlanner::new(
    Arc::new(stats_collector),
    AdaptiveConfig::production(),
);

// Export model for persistence
let model = planner.export_model();
save_model_to_disk(&model)?;

// Import model on restart
let model = load_model_from_disk()?;
planner.import_model(model);

Support and Resources

  • Documentation: /docs/architecture/v5.2/F5_3_3_DISTRIBUTED_QUERY_OPTIMIZER.md
  • Source Code: /heliosdb-distributed-optimizer/
  • Issue Tracker: File issues with tag query-optimizer
  • Slack Channel: #query-optimization

Changelog

Version 1.0 (November 2, 2025)

  • Initial release
  • Production-ready with 75%+ test coverage
  • Baseline performance benchmarks established
  • Multiple configuration profiles
  • Comprehensive monitoring guide

Document Owner: HeliosDB Performance Team
Review Schedule: Quarterly
Next Review: February 2, 2026