HeliosDB Cost Optimizer Architecture

Date: November 11, 2025
Status: Architecture Documentation
Crates: heliosdb-cost-management & heliosdb-cost-optimizer-v2


Overview

HeliosDB includes two distinct optimizer crates that serve different but complementary purposes:

  1. heliosdb-cost-management - Financial cost management and multi-cloud optimization
  2. heliosdb-cost-optimizer-v2 - Query plan optimization and enhanced EXPLAIN functionality

Important: Despite similar names, these crates address different optimization domains and should not be consolidated.


Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│                       HeliosDB System                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌────────────────────────┐   ┌──────────────────────────┐  │
│  │                        │   │                          │  │
│  │    Query Execution     │   │    Financial Tracking    │  │
│  │         Engine         │   │          System          │  │
│  │                        │   │                          │  │
│  └──────────┬─────────────┘   └────────┬─────────────────┘  │
│             │                          │                    │
│             ▼                          ▼                    │
│  ┌──────────────────────┐     ┌──────────────────────────┐  │
│  │  cost-optimizer-v2   │     │      cost-optimizer      │  │
│  │ (Query Optimization) │     │   (Financial Tracking)   │  │
│  ├──────────────────────┤     ├──────────────────────────┤  │
│  │ • Cardinality Est.   │     │ • Cost Attribution       │  │
│  │ • Join Ordering      │     │ • Budget Management      │  │
│  │ • Index Selection    │     │ • Usage Forecasting      │  │
│  │ • Plan Enumeration   │     │ • Multi-Cloud Analysis   │  │
│  │ • Enhanced EXPLAIN   │     │ • Showback/Chargeback    │  │
│  │ • Statistics         │     │ • Pricing Models         │  │
│  └──────────────────────┘     └──────────────────────────┘  │
│             │                           │                   │
│             └───────────┬───────────────┘                   │
│                         ▼                                   │
│              ┌──────────────────────┐                       │
│              │  Storage & Compute   │                       │
│              │    Infrastructure    │                       │
│              └──────────────────────┘                       │
└─────────────────────────────────────────────────────────────┘

Crate Comparison

| Aspect | cost-optimizer | cost-optimizer-v2 |
| --- | --- | --- |
| Primary Purpose | Financial cost management | Query plan optimization |
| Domain | Business/Finance (F6.20) | Database Engine (F5.2.6) |
| LOC | 6,750 lines | 11,496 lines |
| Files | 16 modules | 21 modules |
| Usage | 4 references | 17 references |
| Unit of Cost | USD/Credits | Query execution time |
| Users | DBAs, Finance teams | Query optimizer |
| Output | Cost reports, forecasts | Query execution plans |
| Optimization | Budget allocation | Query performance |

heliosdb-cost-management (v1) - Financial Cost Management

Purpose

Tracks and optimizes financial costs for running HeliosDB workloads across different infrastructure providers.

Core Functionality

1. Cost Attribution (F6.20)

use heliosdb_cost_management::{CostAttributor, AttributionScope};
let attributor = CostAttributor::new();
// Track per-user costs
attributor.track_query_cost(
    "user123", // user_id
    "q456",    // query_id
    0.0042,    // cost_usd
);
// Get user's total costs
let user_cost = attributor.get_user_cost("user123", time_range);
// Returns: $42.50 for this billing period
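
As a rough illustration of what per-user attribution involves, the following sketch accumulates query costs in a map keyed by user. `SimpleAttributor` and its methods are hypothetical stand-ins, not the actual `CostAttributor` implementation, and time-range filtering is left out.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory attributor; not the real `CostAttributor`.
#[derive(Default)]
pub struct SimpleAttributor {
    // Running USD total per user id.
    totals: HashMap<String, f64>,
}

impl SimpleAttributor {
    pub fn new() -> Self {
        Self::default()
    }

    /// Add the cost of one query to the user's running total.
    pub fn track_query_cost(&mut self, user_id: &str, cost_usd: f64) {
        *self.totals.entry(user_id.to_string()).or_insert(0.0) += cost_usd;
    }

    /// Total cost recorded for a user, or 0.0 if nothing was tracked.
    pub fn get_user_cost(&self, user_id: &str) -> f64 {
        self.totals.get(user_id).copied().unwrap_or(0.0)
    }
}
```

A production tracker would also need timestamps per entry so that totals can be restricted to a billing period.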

2. Budget Management

use heliosdb_cost_management::{BudgetManager, BudgetPeriod};
let mut budget = BudgetManager::new();
// Set budget limits
budget.set_budget(
    "acme-corp",           // tenant
    1000.0,                // limit_usd
    BudgetPeriod::Monthly, // period
);
// Get alerts when approaching limit
if budget.check_threshold("acme-corp") > 0.8 {
    // Send alert: 80% of budget consumed
}
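
The threshold check above amounts to comparing spend against the limit. A minimal sketch, assuming a budget is just a limit plus a running total (the `Budget` type and its field names are illustrative, not the crate's API):

```rust
/// Hypothetical budget record; field names are illustrative.
pub struct Budget {
    pub limit_usd: f64,
    pub spent_usd: f64,
}

impl Budget {
    /// Fraction of the budget consumed (1.0 means fully spent).
    pub fn consumed_fraction(&self) -> f64 {
        if self.limit_usd <= 0.0 {
            return 1.0; // Treat a zero or negative limit as exhausted.
        }
        self.spent_usd / self.limit_usd
    }

    /// True when consumption has passed the alert threshold (e.g. 0.8).
    pub fn should_alert(&self, threshold: f64) -> bool {
        self.consumed_fraction() > threshold
    }

    /// True when an additional charge would still fit under the limit.
    pub fn can_afford(&self, cost_usd: f64) -> bool {
        self.spent_usd + cost_usd <= self.limit_usd
    }
}
```

With a $1,000 limit and $850 spent, `should_alert(0.8)` is true while a further $100 query is still affordable.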

3. Cost Forecasting

use heliosdb_cost_management::CostForecaster;
let forecaster = CostForecaster::new();
// Predict next month's costs
let forecast = forecaster.forecast(
    &cost_history, // historical_data
    30,            // horizon_days
);
// Returns: Estimated $1,250 ± $150 for next month
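
The document does not say which forecasting model the crate uses. As a naive baseline only, the sketch below multiplies the average daily cost by the horizon and derives a spread from the sample standard deviation, assuming daily costs are independent. The `forecast` function here is hypothetical and unrelated to `CostForecaster::forecast`.

```rust
/// Naive forecast: average daily cost times the horizon.
/// Returns (estimate, spread), where spread scales the sample standard
/// deviation by sqrt(horizon) under an independence assumption.
/// Returns None when there are fewer than two observations.
pub fn forecast(daily_costs: &[f64], horizon_days: u32) -> Option<(f64, f64)> {
    if daily_costs.len() < 2 {
        return None;
    }
    let n = daily_costs.len() as f64;
    let mean = daily_costs.iter().sum::<f64>() / n;
    // Sample variance (divides by n - 1).
    let var = daily_costs
        .iter()
        .map(|c| (c - mean).powi(2))
        .sum::<f64>()
        / (n - 1.0);
    let h = f64::from(horizon_days);
    Some((mean * h, var.sqrt() * h.sqrt()))
}
```

A real time-series forecaster would typically account for trend and seasonality, which this baseline ignores.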

4. Multi-Cloud Cost Analysis

use heliosdb_cost_management::CloudAnalyzer;
let analyzer = CloudAnalyzer::new();
// Compare costs across providers
let comparison = analyzer.compare_providers(
    &workload_profile,           // workload
    vec!["aws", "azure", "gcp"], // providers
);
// Returns:
// AWS: $842/mo (current)
// Azure: $731/mo (13% savings)
// GCP: $798/mo (5% savings)
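
The savings percentages in the sample output follow from the monthly figures: (842 - 731) / 842 ≈ 13% and (842 - 798) / 842 ≈ 5%. A small sketch of that arithmetic, using hypothetical helper names that are not part of `CloudAnalyzer`:

```rust
/// Percentage saved by moving from `current_usd` to `candidate_usd`.
/// A negative result means the candidate is more expensive.
pub fn savings_percent(current_usd: f64, candidate_usd: f64) -> f64 {
    (current_usd - candidate_usd) / current_usd * 100.0
}

/// Name of the cheapest provider from (name, monthly USD) pairs.
pub fn cheapest<'a>(quotes: &[(&'a str, f64)]) -> Option<&'a str> {
    quotes
        .iter()
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(name, _)| *name)
}
```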

Key Modules

  • attribution.rs - Per-user/tenant/project cost attribution
  • budget.rs - Budget limits, alerts, enforcement
  • forecasting.rs - Time-series cost forecasting
  • pricing.rs - Pricing models (compute, storage, network)
  • tracker.rs - Real-time cost tracking (<100μs overhead)
  • analyzer.rs - Multi-cloud cost analysis
  • recommendations.rs - Cost-saving recommendations

Use Cases

  1. Showback/Chargeback - Internal billing for database usage
  2. Budget Enforcement - Prevent cost overruns
  3. Cost Optimization - Right-size infrastructure
  4. Multi-Cloud Planning - Compare provider costs
  5. Capacity Planning - Forecast future costs

heliosdb-cost-optimizer-v2 - Query Plan Optimization

Purpose

Optimizes query execution plans through cost-based optimization and provides enhanced EXPLAIN functionality.

Core Functionality

1. Query Plan Optimization (F5.2.6)

use heliosdb_cost_optimizer_v2::{PlanEnumerator, CostModel};
let enumerator = PlanEnumerator::new();
let cost_model = CostModel::new();
// Generate alternative query plans
let plans = enumerator.enumerate_plans(&query);
// Estimate costs for each plan (borrow so `plans` stays usable afterwards)
for plan in &plans {
    let cost = cost_model.estimate_cost(plan);
    // Returns: Plan execution time estimate in milliseconds
}
// Select lowest-cost plan (assumes the estimate is an f64, which is not Ord)
let best_plan = plans.iter().min_by(|a, b| {
    cost_model.estimate_cost(a).total_cmp(&cost_model.estimate_cost(b))
});
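
Selecting the cheapest plan is a minimum over floating-point estimates. Because `f64` does not implement `Ord`, `min_by_key` cannot be used directly; the sketch below uses `min_by` with `f64::total_cmp` instead. `CandidatePlan` is a hypothetical type with a precomputed estimate, not the crate's plan representation.

```rust
/// Hypothetical candidate plan with a precomputed cost estimate.
#[derive(Debug, Clone, PartialEq)]
pub struct CandidatePlan {
    pub name: String,
    pub estimated_ms: f64,
}

/// Pick the plan with the lowest estimated cost, or None for an empty slice.
pub fn select_best(plans: &[CandidatePlan]) -> Option<&CandidatePlan> {
    plans
        .iter()
        .min_by(|a, b| a.estimated_ms.total_cmp(&b.estimated_ms))
}
```

Caching the estimate on the plan also avoids calling the cost model twice per comparison.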

2. Cardinality Estimation

use heliosdb_cost_optimizer_v2::CardinalityEstimator;
let estimator = CardinalityEstimator::new();
// Estimate result set size
let cardinality = estimator.estimate(
    "orders",                          // table
    "region = 'US' AND amount > 1000", // predicate
);
// Returns: ~12,500 rows (95%+ accuracy with histograms)
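
One common textbook way to estimate a conjunctive predicate is to multiply per-predicate selectivities, assuming the predicates are independent. Whether `CardinalityEstimator` works this way is not stated here; the sketch and its numbers are purely illustrative.

```rust
/// Estimate rows matching a conjunction of predicates, assuming the
/// predicates are statistically independent (a common simplification).
/// Each selectivity is clamped to the range [0, 1].
pub fn estimate_rows(table_rows: u64, selectivities: &[f64]) -> u64 {
    let combined: f64 = selectivities
        .iter()
        .map(|s| s.clamp(0.0, 1.0))
        .product();
    (table_rows as f64 * combined).round() as u64
}
```

For example, a hypothetical 1,000,000-row table with selectivities of 0.25 and 0.05 yields an estimate of 12,500 rows. Correlated columns break the independence assumption, which is one reason histograms and multi-column statistics matter.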

3. Enhanced EXPLAIN (NEW - Phases 1-3)

use heliosdb_cost_optimizer_v2::{OptimizerContext, FeatureRegistry};
let mut ctx = OptimizerContext::new();
// Track features used
ctx.activate_feature("hash_join", "work_mem sufficient for hash table");
ctx.activate_feature("partition_pruning", "Static partition key detected");
// Record optimizer decisions
ctx.record_decision_with_cost(
    "Join Strategy", // decision_type
    "Hash Join",     // chosen
    1245.0,          // chosen_cost
    vec![            // alternatives: (name, cost, reason rejected)
        ("Nested Loop", 8420.0, "Too slow for large outer relation"),
        ("Merge Join", 2100.0, "Would require sort on both inputs"),
    ],
    "Hash join is fastest with available work_mem", // reasoning
);
// Generate EXPLAIN output
let explain = ctx.generate_explain_output();
// Returns: Comprehensive plan with WHY/WHY NOT reasoning
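
To make the WHY/WHY NOT idea concrete, here is a minimal sketch of how a recorded decision could be rendered as text. The `Decision` struct and the output layout are assumptions for illustration; the actual format produced by `generate_explain_output` is not shown in this document.

```rust
/// Hypothetical record of one optimizer decision.
pub struct Decision {
    pub decision_type: String,
    pub chosen: String,
    pub chosen_cost: f64,
    /// (name, estimated cost, reason it was rejected)
    pub alternatives: Vec<(String, f64, String)>,
    pub reasoning: String,
}

/// Render the decision with one WHY line and one WHY NOT line per alternative.
pub fn render(d: &Decision) -> String {
    let mut out = format!(
        "{}: {} (cost {:.1})\n  WHY: {}\n",
        d.decision_type, d.chosen, d.chosen_cost, d.reasoning
    );
    for (name, cost, reason) in &d.alternatives {
        out.push_str(&format!("  WHY NOT {} (cost {:.1}): {}\n", name, cost, reason));
    }
    out
}
```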

4. Statistics Collection

use heliosdb_cost_optimizer_v2::StatisticsCollector;
let mut collector = StatisticsCollector::new();
// Analyze table statistics from a 10% sample
collector.analyze_table("orders", 0.1)?; // sample_rate = 0.1
// Returns:
// - Row count: 1,245,891
// - Distinct values per column
// - Most common values (MCV)
// - Histograms (equi-depth)
// - Null fraction
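
An equi-depth histogram places bucket boundaries so that each bucket holds roughly the same number of values. The sketch below computes bucket upper bounds from a sample; it is a generic illustration of the technique, not the code in `statistics.rs`.

```rust
/// Build equi-depth bucket upper bounds: each bucket holds roughly the
/// same number of sampled values. Returns one upper bound per bucket,
/// or an empty vector when there is no input or no buckets are requested.
pub fn equi_depth_bounds(mut values: Vec<i64>, buckets: usize) -> Vec<i64> {
    if values.is_empty() || buckets == 0 {
        return Vec::new();
    }
    values.sort_unstable();
    let n = values.len();
    (1..=buckets)
        .map(|i| {
            // Number of values covered by buckets 1..=i: ceil(i * n / buckets).
            let end = (i * n + buckets - 1) / buckets;
            values[end - 1]
        })
        .collect()
}
```

With the values 1 through 8 and four buckets, the bounds are 2, 4, 6, and 8, so each bucket covers two values.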

5. Join Optimization

use heliosdb_cost_optimizer_v2::JoinOptimizer;
let optimizer = JoinOptimizer::new();
// Optimize join order for multi-table query
let optimized_order = optimizer.optimize_join_order(
    vec!["orders", "customers", "products"], // tables
    &join_predicates,                        // join_conditions
);
// Returns: Best join order using dynamic programming
// Example: customers ⋈ orders ⋈ products (cost: 2,450ms)
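
Dynamic-programming join ordering builds the best plan for each subset of tables from the best plans of smaller subsets. The toy sketch below handles left-deep plans only and uses an assumed cost model (the sum of intermediate result sizes); the function, its inputs, and the cost model are illustrative and not taken from `JoinOptimizer`.

```rust
/// Toy left-deep join ordering via dynamic programming over table subsets.
/// Cost of a plan = sum of intermediate result sizes (an assumed cost model).
/// `rows[i]` is the size of table i; `sel[i][j]` is the join selectivity
/// between tables i and j (1.0 when there is no join predicate).
/// Returns the join order as table indices plus the total cost.
pub fn best_join_order(rows: &[f64], sel: &[Vec<f64>]) -> (Vec<usize>, f64) {
    let n = rows.len();
    if n == 0 {
        return (Vec::new(), 0.0);
    }
    let full = (1usize << n) - 1;

    // Estimated cardinality of joining every table in `mask`.
    let card = |mask: usize| -> f64 {
        let mut c = 1.0;
        for i in 0..n {
            if mask & (1 << i) == 0 {
                continue;
            }
            c *= rows[i];
            for j in (i + 1)..n {
                if mask & (1 << j) != 0 {
                    c *= sel[i][j];
                }
            }
        }
        c
    };

    // best[mask] = (cost, last table added) for the cheapest left-deep plan.
    let mut best: Vec<Option<(f64, usize)>> = vec![None; full + 1];
    for i in 0..n {
        best[1 << i] = Some((0.0, i)); // Base table scans are free here.
    }
    for mask in 1..=full {
        if mask.count_ones() < 2 {
            continue;
        }
        let size = card(mask);
        for last in 0..n {
            if mask & (1 << last) == 0 {
                continue;
            }
            let prev = mask & !(1 << last);
            if let Some((prev_cost, _)) = best[prev] {
                let cost = prev_cost + size;
                if best[mask].map_or(true, |(c, _)| cost < c) {
                    best[mask] = Some((cost, last));
                }
            }
        }
    }

    // Walk back from the full set to recover the order.
    let total = best[full].map(|(c, _)| c).unwrap_or(0.0);
    let mut order = Vec::with_capacity(n);
    let mut mask = full;
    while mask != 0 {
        let (_, last) = best[mask].expect("every non-empty subset has a plan");
        order.push(last);
        mask &= !(1 << last);
    }
    order.reverse();
    (order, total)
}
```

The table has 2^n entries, so this approach is only practical for a modest number of tables; production optimizers usually switch to heuristics beyond that.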

Key Modules

  • cardinality.rs - Cardinality estimation (21,277 LOC)
  • cost_model.rs - Query cost modeling (25,522 LOC)
  • plan_enumerator.rs - Plan generation (21,819 LOC)
  • statistics.rs - Statistics collection (19,851 LOC)
  • features.rs - Feature detection (40,820 LOC) Phase 1
  • context.rs - Optimizer reasoning (20,878 LOC) Phase 2
  • config_tracker.rs - Configuration tracking (21,948 LOC) Phase 3

Use Cases

  1. Automatic Query Optimization - Find fastest execution plan
  2. Performance Tuning - Understand why queries are slow
  3. Index Design - Identify missing indexes
  4. Query Troubleshooting - Debug performance issues
  5. Capacity Planning - Predict query resource usage

When to Use Each Crate

Use heliosdb-cost-management when:

  • You need to track financial costs of database operations
  • You want per-user/tenant cost attribution
  • You need budget limits and alerts
  • You want to forecast future costs
  • You’re doing multi-cloud cost comparison
  • You need showback/chargeback for internal billing
  • You want cost-saving recommendations

Example Scenario: A SaaS company wants to track database costs per customer and send alerts when any customer approaches their budget limit.

Use heliosdb-cost-optimizer-v2 when:

  • You need to optimize query execution plans
  • You want cardinality estimation for queries
  • You need join order optimization
  • You want to understand optimizer decisions (EXPLAIN)
  • You need statistics collection (ANALYZE)
  • You’re tuning query performance
  • You want enhanced EXPLAIN output with reasoning

Example Scenario: A DBA wants to understand why a query is slow and see what alternative execution strategies the optimizer considered.


Overlap Analysis

Overlapping Functionality (~20%)

Both crates have modules for:

| Feature | cost-optimizer | cost-optimizer-v2 | Note |
| --- | --- | --- | --- |
| Budget Management | budget.rs | budget_manager.rs | Different contexts |
| Cost Forecasting | forecasting.rs | forecaster.rs | USD vs query time |
| Cost Tracking | tracker.rs | real_time_tracker.rs | Financial vs execution |
| Metrics | analytics.rs | metrics.rs | Different metrics |

Why the overlap exists:

  • cost-optimizer tracks financial costs (USD/credits)
  • cost-optimizer-v2 tracks query execution costs (milliseconds)

Example:

// cost-optimizer: Financial budget
budget.set_budget("acme", 1000.0); // tenant, limit_usd
// cost-optimizer-v2: Query execution budget
budget.set_budget(5000); // query_timeout_ms: kill queries >5s

Despite similar module names, they solve different problems in different domains.


Integration Example

Both crates can be used together in a production system:

use heliosdb_cost_management::{CostAttributor, BudgetManager};
use heliosdb_cost_optimizer_v2::{OptimizerContext, PlanSelector};
// 1. Optimize query plan (cost-optimizer-v2)
let mut optimizer_ctx = OptimizerContext::new();
let plan = optimize_query(&query, &mut optimizer_ctx)?;
// 2. Estimate execution cost (cost-optimizer-v2)
let execution_time_ms = estimate_execution_time(&plan);
// 3. Convert to financial cost (cost-optimizer)
let cost_usd = execution_time_ms * COST_PER_MS;
// 4. Check budget (cost-optimizer)
let mut budget = BudgetManager::new();
if !budget.can_afford(tenant, cost_usd) {
    return Err("Budget exceeded");
}
// 5. Execute query
let result = execute_plan(&plan)?;
// 6. Track actual cost (cost-optimizer)
let attributor = CostAttributor::new();
attributor.track_query_cost(
    tenant,   // tenant
    query.id, // query_id
    cost_usd, // cost_usd
);

Design Decisions

Why Not Consolidate?

Decision: Keep crates separate

Rationale:

  1. Different Purposes

    • Financial cost management ≠ Query optimization
    • Different users (Finance vs Query Engine)
    • Different outputs (Cost reports vs Execution plans)
  2. Minimal Overlap (20%)

    • Only the budget, forecasting, tracking, and metrics modules overlap
    • Even overlapping modules solve different problems
    • 80% of functionality is unique to each crate
  3. Clear Separation of Concerns

    • cost-optimizer = Business logic
    • cost-optimizer-v2 = Database engine logic
    • Mixing them would reduce clarity
  4. Independent Evolution

    • Financial cost tracking evolves with pricing models
    • Query optimization evolves with engine features
    • Different release cycles, different dependencies
  5. Faster Compilation

    • Separate crates can compile in parallel
    • Combined crate would be 18K+ LOC (slower builds)

Why “v2” Naming?

Historical Context: Despite the “v2” suffix, this is a different crate with a different purpose, not a newer version of cost-optimizer.

Recommended Rename: heliosdb-cost-optimizer-v2 → heliosdb-query-optimizer

  • Clearer purpose
  • No version confusion
  • Better discoverability

Module Responsibilities

cost-optimizer Modules

| Module | Responsibility | LOC | F6.20 |
| --- | --- | --- | --- |
| attribution.rs | Per-user/tenant cost tracking | ~850 | |
| budget.rs | Budget limits and enforcement | ~620 | |
| forecasting.rs | Time-series cost forecasting | ~710 | |
| pricing.rs | Pricing models (compute/storage) | ~540 | |
| tracker.rs | Real-time cost tracking | ~480 | |
| analyzer.rs | Multi-cloud cost analysis | ~890 | Legacy |
| recommendations.rs | Cost-saving recommendations | ~760 | Legacy |

cost-optimizer-v2 Modules

| Module | Responsibility | LOC | F5.2.6 |
| --- | --- | --- | --- |
| cardinality.rs | Cardinality estimation | 21,277 | |
| cost_model.rs | Query cost modeling | 25,522 | |
| plan_enumerator.rs | Plan generation | 21,819 | |
| statistics.rs | Statistics collection | 19,851 | |
| features.rs | Feature detection (Phase 1) | 40,820 | NEW |
| context.rs | Optimizer reasoning (Phase 2) | 20,878 | NEW |
| config_tracker.rs | Config tracking (Phase 3) | 21,948 | NEW |

Performance Characteristics

cost-optimizer

  • Tracking Overhead: <100μs per query
  • Memory Usage: ~10MB base + ~100 bytes per tracked query
  • Forecasting Latency: ~50ms for 30-day forecast
  • Attribution Queries: <5ms for user cost lookups

cost-optimizer-v2

  • Cardinality Estimation: ~2-10ms per table
  • Plan Enumeration: ~50-500ms for complex queries
  • Statistics Collection: ~100ms per million rows (sampling)
  • EXPLAIN Generation: <1ms (minimal overhead)

Configuration

cost-optimizer Configuration

# heliosdb.conf
[cost_optimizer]
cost_tracking_enabled = true
cost_tracking_precision = "microseconds"
attribution_granularity = "per_user"
budget_enforcement = true
forecast_horizon_days = 30
multi_cloud_analysis = true
[pricing]
compute_cost_per_vcpu_hour = 0.045 # USD
storage_cost_per_gb_month = 0.10 # USD
network_cost_per_gb = 0.02 # USD
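
To show how these rates could turn resource usage into dollars, the sketch below mirrors the `[pricing]` keys in a struct. The `Pricing` type and both functions are hypothetical, and the linear cost model (duration × vCPUs × hourly rate) is an assumption rather than the documented behavior of `pricing.rs`.

```rust
/// Pricing rates mirroring the `[pricing]` keys (all in USD).
pub struct Pricing {
    pub compute_cost_per_vcpu_hour: f64,
    pub storage_cost_per_gb_month: f64,
    pub network_cost_per_gb: f64,
}

/// Compute cost of a query that ran for `execution_ms` on `vcpus` vCPUs.
pub fn query_compute_cost(p: &Pricing, execution_ms: f64, vcpus: f64) -> f64 {
    const MS_PER_HOUR: f64 = 3_600_000.0;
    execution_ms / MS_PER_HOUR * vcpus * p.compute_cost_per_vcpu_hour
}

/// Monthly bill for a fixed footprint: vCPU-hours, stored GB, transferred GB.
pub fn monthly_cost(p: &Pricing, vcpu_hours: f64, storage_gb: f64, network_gb: f64) -> f64 {
    vcpu_hours * p.compute_cost_per_vcpu_hour
        + storage_gb * p.storage_cost_per_gb_month
        + network_gb * p.network_cost_per_gb
}
```

With the rates above, one hour on two vCPUs costs $0.09, and 100 vCPU-hours plus 50 GB stored plus 10 GB transferred comes to $9.70 for the month.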

cost-optimizer-v2 Configuration

# heliosdb.conf
[query_optimizer]
work_mem = "512MB"
effective_cache_size = "8GB"
max_parallel_workers_per_gather = 16
random_page_cost = 1.1 # SSD-optimized
enable_hashjoin = true
enable_mergejoin = true
enable_nestloop = true
statistics_target = 100
cardinality_accuracy_target = 0.95

Future Considerations

Potential Extraction (If Overlap Grows)

Monitor: If overlap grows beyond 40%, consider extracting common modules.

Candidate for Extraction:

heliosdb-cost-common/
├─ budget.rs (unified budget management)
├─ forecasting.rs (unified forecasting)
└─ types.rs (shared types)

Current Status: Not needed (overlap is only 20%)

Potential Rename

Proposed:

heliosdb-cost-management → heliosdb-cost-management (keep)
heliosdb-cost-optimizer-v2 → heliosdb-query-optimizer (rename)

Benefits:

  • Clearer purpose
  • No version confusion
  • Better discoverability

Effort: 1-2 hours (mechanical rename)


Summary

Quick Reference

| Question | Answer |
| --- | --- |
| Should I consolidate the crates? | ❌ No - different purposes |
| Which crate for financial tracking? | cost-optimizer |
| Which crate for query optimization? | cost-optimizer-v2 |
| Can I use both together? | Yes - complementary |
| Which is more actively used? | cost-optimizer-v2 (17 vs 4 refs) |
| Should I rename cost-optimizer-v2? | Optional but recommended |

Key Takeaways

  1. Different Domains: Financial vs Query Optimization
  2. Minimal Overlap: Only 20% (budget/forecasting)
  3. Complementary: Can be used together
  4. Keep Separate: Clear separation of concerns
  5. Optional Rename: Consider query-optimizer for clarity

Document Version: 1.0
Last Updated: November 11, 2025
Authors: HeliosDB Development Team