HeliosDB Cost Optimizer Architecture
Date: November 11, 2025
Status: Architecture Documentation
Crates: heliosdb-cost-management & heliosdb-cost-optimizer-v2
Overview
HeliosDB includes two distinct optimizer crates that serve different but complementary purposes:
- heliosdb-cost-management - Financial cost management and multi-cloud optimization
- heliosdb-cost-optimizer-v2 - Query plan optimization and enhanced EXPLAIN functionality
Important: Despite similar names, these crates address different optimization domains and should not be consolidated.
Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│                       HeliosDB System                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌────────────────────────┐  ┌──────────────────────────┐   │
│  │                        │  │                          │   │
│  │  Query Execution       │  │  Financial Tracking      │   │
│  │  Engine                │  │  System                  │   │
│  │                        │  │                          │   │
│  └──────────┬─────────────┘  └────────┬─────────────────┘   │
│             │                         │                     │
│             ▼                         ▼                     │
│  ┌──────────────────────┐    ┌──────────────────────────┐   │
│  │ cost-optimizer-v2    │    │ cost-optimizer           │   │
│  │ (Query Optimization) │    │ (Financial Tracking)     │   │
│  ├──────────────────────┤    ├──────────────────────────┤   │
│  │ • Cardinality Est.   │    │ • Cost Attribution       │   │
│  │ • Join Ordering      │    │ • Budget Management      │   │
│  │ • Index Selection    │    │ • Usage Forecasting      │   │
│  │ • Plan Enumeration   │    │ • Multi-Cloud Analysis   │   │
│  │ • Enhanced EXPLAIN   │    │ • Showback/Chargeback    │   │
│  │ • Statistics         │    │ • Pricing Models         │   │
│  └──────────────────────┘    └──────────────────────────┘   │
│             │                           │                   │
│             └───────────┬───────────────┘                   │
│                         ▼                                   │
│              ┌──────────────────────┐                       │
│              │  Storage & Compute   │                       │
│              │  Infrastructure      │                       │
│              └──────────────────────┘                       │
└─────────────────────────────────────────────────────────────┘
Crate Comparison
| Aspect | cost-optimizer | cost-optimizer-v2 |
|---|---|---|
| Primary Purpose | Financial cost management | Query plan optimization |
| Domain | Business/Finance (F6.20) | Database Engine (F5.2.6) |
| LOC | 6,750 lines | 11,496 lines |
| Files | 16 modules | 21 modules |
| Usage | 4 references | 17 references |
| Unit of Cost | USD/Credits | Query execution time |
| Users | DBAs, Finance teams | Query optimizer |
| Output | Cost reports, forecasts | Query execution plans |
| Optimization | Budget allocation | Query performance |
heliosdb-cost-management (v1) - Financial Cost Management
Purpose
Tracks and optimizes financial costs for running HeliosDB workloads across different infrastructure providers.
Core Functionality
1. Cost Attribution (F6.20)
use heliosdb_cost_management::{CostAttributor, AttributionScope};
let attributor = CostAttributor::new();
// Track per-user costs
attributor.track_query_cost(
    "user123", // user_id
    "q456",    // query_id
    0.0042,    // cost_usd
);
// Get user's total costs
let user_cost = attributor.get_user_cost("user123", time_range);
// Returns: $42.50 for this billing period
2. Budget Management
use heliosdb_cost_management::BudgetManager;
let mut budget = BudgetManager::new();
// Set budget limits
budget.set_budget(
    "acme-corp",           // tenant
    1000.0,                // limit_usd
    BudgetPeriod::Monthly, // period
);
// Get alerts when approaching limit
if budget.check_threshold("acme-corp") > 0.8 {
    // Send alert: 80% of budget consumed
}
3. Cost Forecasting
use heliosdb_cost_management::CostForecaster;
let forecaster = CostForecaster::new();
// Predict next month's costs
let forecast = forecaster.forecast(
    &cost_history, // historical_data
    30,            // horizon_days
);
// Returns: Estimated $1,250 ± $150 for next month
4. Multi-Cloud Cost Analysis
use heliosdb_cost_management::CloudAnalyzer;
let analyzer = CloudAnalyzer::new();
// Compare costs across providers
let comparison = analyzer.compare_providers(
    &workload_profile,           // workload
    vec!["aws", "azure", "gcp"], // providers
);
// Returns:
// AWS: $842/mo (current)
// Azure: $731/mo (13% savings)
// GCP: $798/mo (5% savings)
Key Modules
- attribution.rs - Per-user/tenant/project cost attribution
- budget.rs - Budget limits, alerts, enforcement
- forecasting.rs - Time-series cost forecasting
- pricing.rs - Pricing models (compute, storage, network)
- tracker.rs - Real-time cost tracking (<100μs overhead)
- analyzer.rs - Multi-cloud cost analysis
- recommendations.rs - Cost-saving recommendations
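The attribution and budget-threshold steps above can be sketched with a small in-memory ledger. This is only an illustration under stated assumptions: `Ledger` and its methods are hypothetical names, not the API of heliosdb-cost-management, and the real `CostAttributor` and `BudgetManager` may work differently.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory ledger (NOT the heliosdb-cost-management API).
#[derive(Default)]
pub struct Ledger {
    /// Accumulated spend per tenant, in USD.
    spend_usd: HashMap<String, f64>,
    /// Budget limit per tenant, in USD.
    limit_usd: HashMap<String, f64>,
}

impl Ledger {
    pub fn new() -> Self {
        Self::default()
    }

    /// Attribute the cost of one query to a tenant.
    pub fn record(&mut self, tenant: &str, cost_usd: f64) {
        *self.spend_usd.entry(tenant.to_string()).or_insert(0.0) += cost_usd;
    }

    /// Set (or replace) the budget limit for a tenant.
    pub fn set_budget(&mut self, tenant: &str, limit_usd: f64) {
        self.limit_usd.insert(tenant.to_string(), limit_usd);
    }

    /// Total spend recorded for a tenant; 0.0 if nothing was recorded.
    pub fn tenant_total(&self, tenant: &str) -> f64 {
        self.spend_usd.get(tenant).copied().unwrap_or(0.0)
    }

    /// Fraction of the budget consumed so far.
    /// Returns None when the tenant has no budget or a non-positive limit.
    pub fn budget_fraction(&self, tenant: &str) -> Option<f64> {
        let limit = *self.limit_usd.get(tenant)?;
        if limit <= 0.0 {
            return None;
        }
        Some(self.tenant_total(tenant) / limit)
    }
}
```

An alert at 80% of budget then reduces to checking whether `budget_fraction` returns a value above 0.8.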
Use Cases
- Showback/Chargeback - Internal billing for database usage
- Budget Enforcement - Prevent cost overruns
- Cost Optimization - Right-size infrastructure
- Multi-Cloud Planning - Compare provider costs
- Capacity Planning - Forecast future costs
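To make the forecasting and capacity-planning idea concrete, here is a deliberately naive baseline: project the mean daily cost over the horizon and report a spread derived from the sample standard deviation. The document does not say which method forecasting.rs uses, so the function name, the `Forecast` type, and the assumption of independent daily costs are all illustrative.

```rust
/// Hypothetical result type: expected total cost and its standard deviation.
#[derive(Debug)]
pub struct Forecast {
    pub expected_usd: f64,
    pub std_dev_usd: f64,
}

/// Naive mean-based projection (an assumed baseline, not the crate's method).
/// Treats daily costs as independent, so the variance of the total grows
/// linearly with the horizon. Needs at least two observations.
pub fn forecast_naive(daily_costs_usd: &[f64], horizon_days: u32) -> Option<Forecast> {
    let n = daily_costs_usd.len();
    if n < 2 {
        return None;
    }
    let mean = daily_costs_usd.iter().sum::<f64>() / n as f64;
    // Unbiased sample variance of the daily costs.
    let variance = daily_costs_usd
        .iter()
        .map(|c| (c - mean).powi(2))
        .sum::<f64>()
        / (n as f64 - 1.0);
    let horizon = f64::from(horizon_days);
    Some(Forecast {
        expected_usd: mean * horizon,
        std_dev_usd: (variance * horizon).sqrt(),
    })
}
```

A production forecaster would normally account for trend and seasonality; this sketch only shows the shape of an "estimate ± spread" result.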
heliosdb-cost-optimizer-v2 - Query Plan Optimization
Purpose
Optimizes query execution plans through cost-based optimization and provides enhanced EXPLAIN functionality.
Core Functionality
1. Query Plan Optimization (F5.2.6)
use heliosdb_cost_optimizer_v2::{PlanEnumerator, CostModel};
let enumerator = PlanEnumerator::new();
let cost_model = CostModel::new();
// Generate alternative query plans
let plans = enumerator.enumerate_plans(&query);
// Estimate costs for each plan (borrow, so `plans` stays usable below)
for plan in &plans {
    let cost = cost_model.estimate_cost(plan);
    // Returns: Plan execution time estimate in milliseconds
}
// Select lowest-cost plan
let best_plan = plans.iter().min_by_key(|p| cost_model.estimate_cost(p));
2. Cardinality Estimation
use heliosdb_cost_optimizer_v2::CardinalityEstimator;
let estimator = CardinalityEstimator::new();
// Estimate result set size
let cardinality = estimator.estimate(
    "orders",                          // table
    "region = 'US' AND amount > 1000", // predicate
);
// Returns: ~12,500 rows (95%+ accuracy with histograms)
3. Enhanced EXPLAIN (NEW - Phases 1-3)
use heliosdb_cost_optimizer_v2::{OptimizerContext, FeatureRegistry};
let mut ctx = OptimizerContext::new();
// Track features used
ctx.activate_feature("hash_join", "work_mem sufficient for hash table");
ctx.activate_feature("partition_pruning", "Static partition key detected");
// Record optimizer decisions
ctx.record_decision_with_cost(
    "Join Strategy", // decision_type
    "Hash Join",     // chosen
    1245.0,          // chosen_cost
    vec![            // alternatives: (name, cost, reason rejected)
        ("Nested Loop", 8420.0, "Too slow for large outer relation"),
        ("Merge Join", 2100.0, "Would require sort on both inputs"),
    ],
    "Hash join is fastest with available work_mem", // reasoning
);
// Generate EXPLAIN output
let explain = ctx.generate_explain_output();
// Returns: Comprehensive plan with WHY/WHY NOT reasoning
4. Statistics Collection
use heliosdb_cost_optimizer_v2::StatisticsCollector;
let mut collector = StatisticsCollector::new();
// Analyze table statistics
collector.analyze_table("orders", 0.1)?; // second argument: sample_rate
// Returns:
// - Row count: 1,245,891
// - Distinct values per column
// - Most common values (MCV)
// - Histograms (equi-depth)
// - Null fraction
5. Join Optimization
use heliosdb_cost_optimizer_v2::JoinOptimizer;
let optimizer = JoinOptimizer::new();
// Optimize join order for multi-table query
let optimized_order = optimizer.optimize_join_order(
    vec!["orders", "customers", "products"], // tables
    &join_predicates,                        // join_conditions
);
// Returns: Best join order using dynamic programming
// Example: customers ⋈ orders ⋈ products (cost: 2,450ms)
Key Modules
- cardinality.rs - Cardinality estimation (21,277 LOC)
- cost_model.rs - Query cost modeling (25,522 LOC)
- plan_enumerator.rs - Plan generation (21,819 LOC)
- statistics.rs - Statistics collection (19,851 LOC)
- features.rs - Feature detection (40,820 LOC), Phase 1
- context.rs - Optimizer reasoning (20,878 LOC), Phase 2
- config_tracker.rs - Configuration tracking (21,948 LOC), Phase 3
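One detail of the plan-selection step is worth a sketch. If the cost model returns a floating-point estimate (an assumption; the decision costs in the EXPLAIN example are written as floats), `Iterator::min_by_key` will not compile because `f64` does not implement `Ord`. A comparison with `f64::total_cmp` avoids that. The `Plan` type and `cheapest` function are hypothetical stand-ins, not types from heliosdb-cost-optimizer-v2.

```rust
/// Hypothetical plan summary: a label plus an estimated execution time.
#[derive(Debug, Clone, PartialEq)]
pub struct Plan {
    pub name: &'static str,
    pub estimated_ms: f64,
}

/// Return the plan with the lowest estimated cost, or None for an empty slice.
/// `total_cmp` gives f64 a total order, so NaN cannot cause a panic here.
pub fn cheapest(plans: &[Plan]) -> Option<&Plan> {
    plans
        .iter()
        .min_by(|a, b| a.estimated_ms.total_cmp(&b.estimated_ms))
}
```

With the three join strategies from the EXPLAIN example (1245.0, 8420.0, 2100.0), `cheapest` selects the hash join.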
Use Cases
- Automatic Query Optimization - Find fastest execution plan
- Performance Tuning - Understand why queries are slow
- Index Design - Identify missing indexes
- Query Troubleshooting - Debug performance issues
- Capacity Planning - Predict query resource usage
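The join-ordering step is described above as dynamic programming. A minimal sketch of one common variant follows: a DP over table subsets that searches left-deep orders and charges each plan the sum of its intermediate result sizes. The variant, the cost function, and the `JoinGraph` type are assumptions made for illustration; the document does not specify how `JoinOptimizer` models cost.

```rust
/// Hypothetical input: per-table row counts and a symmetric n x n matrix of
/// pairwise join selectivities (1.0 means no predicate, i.e. a cross product).
pub struct JoinGraph {
    pub rows: Vec<f64>,
    pub sel: Vec<Vec<f64>>,
}

/// Best left-deep join order and its cost, where cost is the sum of the
/// estimated sizes of all intermediate results (base scans are not charged).
/// Returns None for an empty graph or more than 16 tables.
pub fn best_left_deep_order(g: &JoinGraph) -> Option<(Vec<usize>, f64)> {
    let n = g.rows.len();
    if n == 0 || n > 16 {
        return None;
    }
    let full = (1usize << n) - 1;

    // card[mask] = estimated row count after joining every table in `mask`.
    let mut card = vec![1.0f64; full + 1];
    for mask in 1..=full {
        let i = mask.trailing_zeros() as usize; // lowest-numbered table in the set
        let rest = mask & (mask - 1); // the set without table i
        let mut c = card[rest] * g.rows[i];
        for j in 0..n {
            if rest & (1 << j) != 0 {
                c *= g.sel[i][j];
            }
        }
        card[mask] = c;
    }

    // best[mask] = (lowest cost to build `mask`, table joined last).
    let mut best = vec![(f64::INFINITY, usize::MAX); full + 1];
    for i in 0..n {
        best[1 << i] = (0.0, i);
    }
    for mask in 1..=full {
        if mask.count_ones() < 2 {
            continue;
        }
        for last in 0..n {
            if mask & (1 << last) == 0 {
                continue;
            }
            let prev = mask & !(1 << last);
            let cost = best[prev].0 + card[mask];
            if cost < best[mask].0 {
                best[mask] = (cost, last);
            }
        }
    }

    // Walk back from the full set to recover the join order.
    let mut order = Vec::with_capacity(n);
    let mut mask = full;
    while mask != 0 {
        let last = best[mask].1;
        order.push(last);
        mask &= !(1 << last);
    }
    order.reverse();
    Some((order, best[full].0))
}
```

Under this cost function the first two tables of a left-deep order are interchangeable, so only the set they form and the positions of later tables are meaningful. The search is exponential in the number of tables, which is why the sketch caps the input size.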
When to Use Each Crate
Use heliosdb-cost-management when:
- You need to track financial costs of database operations
- You want per-user/tenant cost attribution
- You need budget limits and alerts
- You want to forecast future costs
- You're doing multi-cloud cost comparison
- You need showback/chargeback for internal billing
- You want cost-saving recommendations
Example Scenario: A SaaS company wants to track database costs per customer and send alerts when any customer approaches their budget limit.
Use heliosdb-cost-optimizer-v2 when:
- You need to optimize query execution plans
- You want cardinality estimation for queries
- You need join order optimization
- You want to understand optimizer decisions (EXPLAIN)
- You need statistics collection (ANALYZE)
- You're tuning query performance
- You want enhanced EXPLAIN output with reasoning
Example Scenario: A DBA wants to understand why a query is slow and see what alternative execution strategies the optimizer considered.
Overlap Analysis
Overlapping Functionality (~20%)
Both crates have modules for:
| Feature | cost-optimizer | cost-optimizer-v2 | Note |
|---|---|---|---|
| Budget Management | budget.rs | budget_manager.rs | Different contexts |
| Cost Forecasting | forecasting.rs | forecaster.rs | USD vs query time |
| Cost Tracking | tracker.rs | real_time_tracker.rs | Financial vs execution |
| Metrics | analytics.rs | metrics.rs | Different metrics |
Why the overlap exists:
- cost-optimizer tracks financial costs (USD/credits)
- cost-optimizer-v2 tracks query execution costs (milliseconds)
Example:
// cost-optimizer: Financial budget
budget.set_budget("acme", 1000.0); // tenant, limit_usd
// cost-optimizer-v2: Query execution budget
budget.set_budget(5000); // query_timeout_ms: kill queries >5s
Despite similar module names, they solve different problems in different domains.
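Because one crate counts cost in USD and the other in milliseconds, it can help to give each unit its own type so that a time budget is never compared against a financial one by accident. The sketch below is an assumption about how that could look; `Usd`, `Millis`, `Rate`, and the two check functions are hypothetical and do not appear in either crate.

```rust
/// Financial cost in US dollars (hypothetical newtype).
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Usd(pub f64);

/// Query execution cost in milliseconds (hypothetical newtype).
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Millis(pub f64);

/// Assumed flat price for execution time; real pricing models may be richer.
pub struct Rate {
    pub usd_per_ms: f64,
}

impl Rate {
    /// Convert an execution-time estimate into a financial cost.
    pub fn price(&self, time: Millis) -> Usd {
        Usd(time.0 * self.usd_per_ms)
    }
}

/// Execution budget: is the estimate within the query timeout?
pub fn within_time_budget(estimate: Millis, timeout: Millis) -> bool {
    estimate <= timeout
}

/// Financial budget: does spending `next` keep the tenant within `limit`?
pub fn within_financial_budget(spent: Usd, next: Usd, limit: Usd) -> bool {
    spent.0 + next.0 <= limit.0
}
```

With these types, passing a `Millis` where a `Usd` is expected is a compile error, and the only bridge between the two domains is the explicit `Rate::price` conversion.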
Integration Example
Both crates can be used together in a production system:
use heliosdb_cost_management::{CostAttributor, BudgetManager};
use heliosdb_cost_optimizer_v2::{OptimizerContext, PlanSelector};
// 1. Optimize query plan (cost-optimizer-v2)
let mut optimizer_ctx = OptimizerContext::new();
let plan = optimize_query(&query, &mut optimizer_ctx)?;
// 2. Estimate execution cost (cost-optimizer-v2)
let execution_time_ms = estimate_execution_time(&plan);
// 3. Convert to financial cost (cost-optimizer)
let cost_usd = execution_time_ms * COST_PER_MS;
// 4. Check budget (cost-optimizer)
let mut budget = BudgetManager::new();
if !budget.can_afford(tenant, cost_usd) {
    return Err("Budget exceeded");
}
// 5. Execute query
let result = execute_plan(&plan)?;
// 6. Track cost (cost-optimizer); this example records the estimate from step 3
let attributor = CostAttributor::new();
attributor.track_query_cost(tenant, query.id, cost_usd); // tenant, query_id, cost_usd
Design Decisions
Why Not Consolidate?
Decision: Keep crates separate
Rationale:
1. Different Purposes
   - Financial cost management ≠ Query optimization
   - Different users (Finance vs Query Engine)
   - Different outputs (Cost reports vs Execution plans)
2. Minimal Overlap (20%)
   - Only budget/forecasting modules overlap
   - Even overlapping modules solve different problems
   - 80% of functionality is unique to each crate
3. Clear Separation of Concerns
   - cost-optimizer = Business logic
   - cost-optimizer-v2 = Database engine logic
   - Mixing them would reduce clarity
4. Independent Evolution
   - Financial cost tracking evolves with pricing models
   - Query optimization evolves with engine features
   - Different release cycles, different dependencies
5. Faster Compilation
   - Separate crates can compile in parallel
   - Combined crate would be 18K+ LOC (slower builds)
Why “v2” Naming?
Historical Context: Despite the “v2” suffix, this crate is not a newer version of cost-optimizer. It is a separate crate with a different purpose.
Recommended Rename: heliosdb-cost-optimizer-v2 → heliosdb-query-optimizer
- Clearer purpose
- No version confusion
- Better discoverability
Module Responsibilities
cost-optimizer Modules
| Module | Responsibility | LOC | F6.20 |
|---|---|---|---|
| attribution.rs | Per-user/tenant cost tracking | ~850 | |
| budget.rs | Budget limits and enforcement | ~620 | |
| forecasting.rs | Time-series cost forecasting | ~710 | |
| pricing.rs | Pricing models (compute/storage) | ~540 | |
| tracker.rs | Real-time cost tracking | ~480 | |
| analyzer.rs | Multi-cloud cost analysis | ~890 | Legacy |
| recommendations.rs | Cost-saving recommendations | ~760 | Legacy |
cost-optimizer-v2 Modules
| Module | Responsibility | LOC | F5.2.6 |
|---|---|---|---|
| cardinality.rs | Cardinality estimation | 21,277 | |
| cost_model.rs | Query cost modeling | 25,522 | |
| plan_enumerator.rs | Plan generation | 21,819 | |
| statistics.rs | Statistics collection | 19,851 | |
| features.rs | Feature detection (Phase 1) | 40,820 | NEW |
| context.rs | Optimizer reasoning (Phase 2) | 20,878 | NEW |
| config_tracker.rs | Config tracking (Phase 3) | 21,948 | NEW |
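The cardinality and statistics modules are described as relying on equi-depth histograms. The sketch below shows how such a histogram can turn a range predicate like `amount > 1000` into a row estimate. It is an illustration under assumptions: `EquiDepthHistogram` and its methods are hypothetical, values are treated as plain `f64`, and rows are assumed to be spread uniformly within each bucket.

```rust
/// Hypothetical equi-depth histogram: bounds[k]..bounds[k + 1] is bucket k,
/// and every bucket holds roughly the same number of sampled rows.
pub struct EquiDepthHistogram {
    bounds: Vec<f64>,
}

impl EquiDepthHistogram {
    /// Build bucket boundaries from a sample by taking evenly spaced quantiles.
    pub fn build(mut sample: Vec<f64>, buckets: usize) -> Option<Self> {
        if sample.is_empty() || buckets == 0 {
            return None;
        }
        sample.sort_by(|a, b| a.total_cmp(b));
        let n = sample.len();
        let bounds = (0..=buckets)
            .map(|k| sample[(k * (n - 1)) / buckets])
            .collect();
        Some(Self { bounds })
    }

    /// Estimated fraction of rows with value > x, interpolating linearly
    /// inside the bucket that contains x.
    pub fn selectivity_gt(&self, x: f64) -> f64 {
        let buckets = self.bounds.len() - 1;
        if x < self.bounds[0] {
            return 1.0;
        }
        if x >= self.bounds[buckets] {
            return 0.0;
        }
        let mut below = 0.0; // number of buckets (possibly fractional) <= x
        for k in 0..buckets {
            let (lo, hi) = (self.bounds[k], self.bounds[k + 1]);
            if x >= hi {
                below += 1.0;
            } else if x > lo {
                below += (x - lo) / (hi - lo);
                break;
            } else {
                break;
            }
        }
        1.0 - below / buckets as f64
    }

    /// Estimated number of rows with value > x in a table of `table_rows` rows.
    pub fn estimate_rows_gt(&self, x: f64, table_rows: u64) -> u64 {
        (self.selectivity_gt(x) * table_rows as f64).round() as u64
    }
}
```

Combining several predicates (for example `region = 'US' AND amount > 1000`) requires further assumptions, such as independence between columns, which this sketch does not cover.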
Performance Characteristics
cost-optimizer
- Tracking Overhead: <100μs per query
- Memory Usage: ~10MB base + ~100 bytes per tracked query
- Forecasting Latency: ~50ms for 30-day forecast
- Attribution Queries: <5ms for user cost lookups
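As a worked example of the memory figure above, the estimate can be written as a function of the number of tracked queries. The function name and the decimal reading of MB (1,000,000 bytes) are assumptions; the constants are the approximate values quoted in this section.

```rust
/// Approximate tracker memory in bytes: ~10 MB base + ~100 bytes per query.
/// Hypothetical helper; assumes 1 MB = 1_000_000 bytes.
pub fn estimated_tracker_memory_bytes(tracked_queries: u64) -> u64 {
    const BASE_BYTES: u64 = 10 * 1_000_000;
    const BYTES_PER_QUERY: u64 = 100;
    // Saturating arithmetic keeps absurdly large inputs from overflowing.
    BASE_BYTES.saturating_add(BYTES_PER_QUERY.saturating_mul(tracked_queries))
}
```

For one million tracked queries this gives 10 MB + 100 MB, or about 110 MB.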
cost-optimizer-v2
- Cardinality Estimation: ~2-10ms per table
- Plan Enumeration: ~50-500ms for complex queries
- Statistics Collection: ~100ms per million rows (sampling)
- EXPLAIN Generation: <1ms (minimal overhead)
Configuration
cost-optimizer Configuration
# heliosdb.conf
[cost_optimizer]
cost_tracking_enabled = true
cost_tracking_precision = "microseconds"
attribution_granularity = "per_user"
budget_enforcement = true
forecast_horizon_days = 30
multi_cloud_analysis = true
[pricing]
compute_cost_per_vcpu_hour = 0.045 # USD
storage_cost_per_gb_month = 0.10 # USD
network_cost_per_gb = 0.02 # USD
cost-optimizer-v2 Configuration
# heliosdb.conf
[query_optimizer]
work_mem = "512MB"
effective_cache_size = "8GB"
max_parallel_workers_per_gather = 16
random_page_cost = 1.1 # SSD-optimized
enable_hashjoin = true
enable_mergejoin = true
enable_nestloop = true
statistics_target = 100
cardinality_accuracy_target = 0.95
Future Considerations
Potential Extraction (If Overlap Grows)
Monitor: If overlap grows beyond 40%, consider extracting common modules.
Candidate for Extraction:
heliosdb-cost-common/
├─ budget.rs (unified budget management)
├─ forecasting.rs (unified forecasting)
└─ types.rs (shared types)
Current Status: Not needed (overlap is only 20%)
Potential Rename
Proposed:
heliosdb-cost-management → heliosdb-cost-management (keep)
heliosdb-cost-optimizer-v2 → heliosdb-query-optimizer (rename)
Benefits:
- Clearer purpose
- No version confusion
- Better discoverability
Effort: 1-2 hours (mechanical rename)
Summary
Quick Reference
| Question | Answer |
|---|---|
| Should I consolidate the crates? | ❌ No - different purposes |
| Which crate for financial tracking? | cost-optimizer |
| Which crate for query optimization? | cost-optimizer-v2 |
| Can I use both together? | Yes - complementary |
| Which is more actively used? | cost-optimizer-v2 (17 vs 4 refs) |
| Should I rename cost-optimizer-v2? | Optional but recommended |
Key Takeaways
- Different Domains: Financial vs Query Optimization
- Minimal Overlap: Only 20% (budget/forecasting)
- Complementary: Can be used together
- Keep Separate: Clear separation of concerns
- Optional Rename: Consider query-optimizer for clarity
Document Version: 1.0
Last Updated: November 11, 2025
Authors: HeliosDB Development Team
Related Documents: