Why Not Analysis - Quick Reference Guide
Why Not Analysis - Quick Reference Guide
Feature: Query Optimizer “Why Not” Analysis
Version: 7.0.0
Module: heliosdb-compute::optimizer::why_not
Overview
The “Why Not” analyzer explains why the query optimizer didn’t apply certain optimizations, helping you understand and improve query performance.
Quick Start
use heliosdb_compute::optimizer::{WhyNotAnalyzer, WhyNotTableStatistics};use heliosdb_compute::planner::{IndexMetadata, IndexType, PhysicalPlan};
// 1. Create analyzerlet mut analyzer = WhyNotAnalyzer::new();
// 2. Register indexesanalyzer.register_index(IndexMetadata { name: "idx_users_email".to_string(), table_id: 1, index_type: IndexType::BTree, columns: vec![0], included_columns: vec![], is_covering: false,});
// 3. Register statisticsanalyzer.register_statistics( "users".to_string(), WhyNotTableStatistics { last_analyzed: "2024-10-01".to_string(), days_since_analysis: 45, row_count: 1000000, percent_modified: 38.5, },);
// 4. Analyze planlet analysis = analyzer.analyze(&plan, &predicates);
// 5. View resultsprintln!("{}", analysis.to_text());Analysis Categories
1. Index Usage Analysis
Detects: Why available indexes weren’t used
Common Reasons:
- ❌ No WHERE clause predicates
- ❌ LIKE with leading wildcard (
%search) - ❌ Hash index used for range queries
- ❌ Low selectivity (seq scan cheaper)
Example Output:
INDEX USAGE ANALYSIS (1 indexes)────────────────────────────────────────────────────────────
1. Index: idx_users_email Table: users | Type: BTree | Severity: High Reason: LIKE with leading wildcard '%search' Suggestion: Use full-text search or trigram index 💰 Cost Impact: 40.0x slower (50.00 vs 2000.00)2. Statistics Freshness
Detects: Stale table statistics affecting plans
Severity Levels:
- 🟢 Low: 7-14 days old
- 🟡 Medium: 14-30 days old
- 🔴 High: > 30 days old
Example Output:
STATISTICS WARNINGS (1 tables)────────────────────────────────────────────────────────────
1. Table: orders Last Updated: 2024-10-01 | Severity: High Freshness: 45 days old (38.5% modified) Action: ANALYZE orders;3. Cardinality Mismatches
Detects: Large differences between estimated and actual row counts
Severity Thresholds:
- 🟢 Low: < 100% error
- 🟡 Medium: 100-500% error
- 🟠 High: 500-1000% error
- 🔴 Critical: > 1000% error
Example Output:
CARDINALITY MISMATCHES (1 detected)────────────────────────────────────────────────────────────
1. Operation: Hash Join (customers JOIN orders) Estimated: 1000 rows | Actual: 150000 rows Error Rate: 14900.0% (150.0x underestimate) Likely Cause: Correlated predicates not in statistics Fix: Create extended statistics4. Optimization Suggestions
Provides: Actionable recommendations with implementation steps
Categories:
- Indexing: Create, modify, or drop indexes
- Statistics: Update or create statistics
- Schema: Schema modifications
- Configuration: Parameter tuning
- QueryRewrite: Query restructuring
Priority Levels: Critical > High > Medium > Low
Effort Estimates:
- Trivial: < 5 minutes
- 🔨 Low: 5-30 minutes
- Medium: 30 min - 2 hours
- 🏗 High: 2-8 hours
- 🏭 VeryHigh: > 8 hours
Example Output:
OPTIMIZATION SUGGESTIONS (1 recommendations)────────────────────────────────────────────────────────────
1. Create covering index for frequent query Category: Indexing | Priority: High | Effort: Low Description: Query executed 50K times/day Expected Benefit: 90.0% improvement (1000.00 -> 100.00) Implementation: • CREATE INDEX idx_users_status ON users(status) INCLUDE (name, email); • ANALYZE users; • Test with EXPLAIN ANALYZEIntegration with EXPLAIN
use heliosdb_compute::optimizer::ExplainOutput;
// Create EXPLAIN outputlet mut explain = ExplainOutput::new(plan, cost, rows);
// Run why_not analysislet analysis = analyzer.analyze(&plan, &predicates);
// Attach to EXPLAINexplain.set_why_not_analysis(analysis);
// Print complete outputprintln!("{}", explain.to_text());Output:
EXPLAIN PLAN (Enhanced)Estimated Cost: 1000.00Estimated Rows: 50000
[... plan details ...]
════════════════════════════════════════════════════════════WHY NOT ANALYSISConfidence: 100.0%Total Issues: 3
[... detailed analysis ...]Common Patterns
Pattern 1: Missing Index
// Detected: TableScan with predicates, no indexes// Suggestion: CREATE INDEX idx_table_col ON table(column);// Benefit: 10-100x improvementPattern 2: Sort Elimination
// Detected: Sort -> TableScan// Suggestion: CREATE INDEX ON table(order_by_column);// Benefit: Eliminates expensive sort operationPattern 3: Covering Index
// Detected: HashAggregate -> TableScan// Suggestion: CREATE INDEX ON table(group_by) INCLUDE (agg_columns);// Benefit: Index-only scan, reduced I/OPattern 4: Hash Index Limitation
// Detected: Hash index with range predicates// Suggestion: ALTER INDEX USING BTREE;// Benefit: Enables range query supportAPI Reference
WhyNotAnalyzer
impl WhyNotAnalyzer { pub fn new() -> Self; pub fn register_index(&mut self, index: IndexMetadata); pub fn register_statistics(&mut self, table: String, stats: TableStatistics); pub fn analyze(&self, plan: &PhysicalPlan, predicates: &[Predicate]) -> WhyNotAnalysis;}WhyNotAnalysis
impl WhyNotAnalysis { pub fn new() -> Self; pub fn total_issues(&self) -> usize; pub fn has_critical_issues(&self) -> bool; pub fn to_text(&self) -> String; pub fn add_index_reason(&mut self, reason: IndexNotUsedReason); pub fn add_statistics_warning(&mut self, warning: StatisticsWarning); pub fn add_cardinality_mismatch(&mut self, mismatch: CardinalityMismatch); pub fn add_suggestion(&mut self, suggestion: OptimizationSuggestion);}Key Types
pub enum Severity { Low, Medium, High, Critical,}
pub enum SuggestionCategory { Indexing, Statistics, Schema, Configuration, QueryRewrite,}
pub enum Priority { Low, Medium, High, Critical,}
pub enum Effort { Trivial, // < 5 min Low, // 5-30 min Medium, // 30 min - 2 hours High, // 2-8 hours VeryHigh, // > 8 hours}Checking Results
let analysis = analyzer.analyze(&plan, &predicates);
// Check for critical issuesif analysis.has_critical_issues() { eprintln!("⚠ Critical issues detected!");}
// Count issuesprintln!("Total issues found: {}", analysis.total_issues());
// Iterate over findingsfor reason in &analysis.index_analysis { if reason.severity >= Severity::High { println!("High priority: {}", reason.reason); }}
// Get suggestions by prioritylet high_priority: Vec<_> = analysis.suggestions .iter() .filter(|s| s.priority >= Priority::High) .collect();Best Practices
1. Register All Available Indexes
for index in database.get_indexes() { analyzer.register_index(index);}2. Keep Statistics Current
analyzer.register_statistics( table_name, WhyNotTableStatistics { last_analyzed: get_last_analyze_time(), days_since_analysis: calculate_days(), row_count: get_row_count(), percent_modified: calculate_modifications(), },);3. Act on High-Priority Suggestions
for suggestion in analysis.suggestions { if suggestion.priority >= Priority::High && suggestion.effort <= Effort::Low { // Quick wins - implement immediately execute_suggestion(suggestion); }}4. Monitor Statistics Freshness
for warning in analysis.statistics_warnings { if warning.days_since_analysis > 30 { // Schedule ANALYZE job schedule_analyze(warning.table_name); }}Troubleshooting
Issue: No suggestions generated
Cause: Plan is already optimal or analyzer lacks metadata Fix: Register indexes and statistics
Issue: Too many false positives
Cause: Statistics not registered Fix: Provide accurate statistics metadata
Issue: Missing cost impact
Cause: Cost model not calibrated Fix: Update cost model parameters
Performance Tips
- Analysis Overhead: < 1ms for most queries
- Memory Usage: ~1KB per finding
- Scalability: Handles 100+ node plans efficiently
Example Workflow
// 1. Setup analyzerlet mut analyzer = WhyNotAnalyzer::new();
// 2. Load metadatafor index in load_indexes() { analyzer.register_index(index);}for (table, stats) in load_statistics() { analyzer.register_statistics(table, stats);}
// 3. Analyze querylet plan = optimizer.create_plan(query)?;let analysis = analyzer.analyze(&plan, &predicates);
// 4. Review findingsif analysis.has_critical_issues() { log::error!("Critical optimization issues detected");}
// 5. Generate reportlet report = analysis.to_text();send_to_dba(report);
// 6. Apply suggestionsfor suggestion in analysis.suggestions { if suggestion.priority == Priority::Critical { apply_suggestion_immediately(suggestion); }}Testing
# Run all testscargo test --package heliosdb-compute why_not_analysis_tests
# Run with outputcargo test --package heliosdb-compute why_not_analysis_tests -- --nocapture
# Run specific testcargo test --package heliosdb-compute test_why_not_analysis_basicFurther Reading
- Full Report:
/home/claude/HeliosDB/docs/implementation/AGENT_8_WEEK_4_WHY_NOT_ANALYSIS_REPORT.md - Source Code:
/home/claude/HeliosDB/heliosdb-compute/src/optimizer/why_not.rs - Tests:
/home/claude/HeliosDB/heliosdb-compute/tests/why_not_analysis_tests.rs
Quick Reference Version: 1.0 Last Updated: November 14, 2025