# Conversational BI - Quick Reference

Developer Cheat Sheet
## Architecture at a Glance
```text
User Query → Session Mgr → Context Tracker → NL2SQL Engine → SQL Validator
                      ↓
              Query Cache Check
                      ↓
  [Cache Hit]  → Return Cached SQL
  [Cache Miss] → Generate via LLM
                      ↓
  Schema Augmenter → Example Store (Few-shot) → Prompt Builder
                      ↓
  Model Router (GPT-4 / Claude / Ollama)
                      ↓
  SQL Validation → Self-Correction (if needed)
                      ↓
               Query Execution
                      ↓
  Explanation Engine → Optimizer Suggester
                      ↓
              Response Assembly
```

## Core Components
| Component | Responsibility | Key Metrics |
|---|---|---|
| SessionManager | Manage conversation lifecycle | 1,000 concurrent sessions |
| ContextTracker | Multi-turn state management | 10+ turn retention |
| Nl2SqlEngine | Generate SQL from NL | 95%+ accuracy |
| SchemaAugmenter | Enhance schema semantics | 20-30% accuracy gain |
| ExplanationEngine | NL explanations | 100% query coverage |
| ModelRouter | LLM selection | <100ms routing |
| QueryCache | Semantic caching | 80%+ hit rate |
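The QueryCache row above refers to *semantic* caching: incoming questions are matched by embedding similarity rather than exact text, so rephrasings of a cached question can still hit. A minimal sketch of the idea (the class name, `embed_fn` callable, and 0.9 threshold are illustrative assumptions, not the actual HeliosDB API):

```python
import math

class SemanticQueryCache:
    """Toy semantic cache: matches questions by cosine similarity of embeddings."""

    def __init__(self, embed_fn, threshold=0.9):
        self.embed_fn = embed_fn   # callable: str -> list[float]
        self.threshold = threshold
        self.entries = []          # list of (embedding, sql)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    def get(self, question):
        q = self.embed_fn(question)
        best = max(self.entries, key=lambda e: self._cosine(q, e[0]), default=None)
        if best and self._cosine(q, best[0]) >= self.threshold:
            return best[1]         # cache hit: reuse previously validated SQL
        return None

    def put(self, question, sql):
        self.entries.append((self.embed_fn(question), sql))
```

In production the embedding would come from a model (the dependencies below include `fastembed`) and lookup would use an index rather than a linear scan; the threshold trades hit rate against the risk of reusing SQL for a subtly different question.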
## API Examples

### REST API
```bash
# Create session
curl -X POST http://localhost:8080/api/v1/conversational-bi/sessions \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user123", "database": "sales_db"}'
# Response: {"session_id": "sess_abc123"}

# Ask question
curl -X POST http://localhost:8080/api/v1/conversational-bi/query \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "sess_abc123",
    "query": "Show me top 10 customers by revenue"
  }'
# Response:
# {
#   "turn_id": 1,
#   "sql": "SELECT customer_name, SUM(amount) as revenue FROM ...",
#   "explanation": "This query finds the top 10 customers...",
#   "results": { "columns": [...], "rows": [...] },
#   "confidence": 0.95,
#   "suggestions": ["Add index on customers(id)..."],
#   "latency_ms": 1850
# }
```

### SQL Interface
```sql
-- Start session
SELECT conversational_bi.start_session('sales_db');
-- Returns: sess_abc123

-- Ask question
SELECT * FROM conversational_bi.ask(
  'sess_abc123',
  'Show me top 10 customers by revenue'
);
-- Returns: SQL, explanation, results

-- Get history
SELECT * FROM conversational_bi.history('sess_abc123');
```

### Python SDK
```python
from heliosdb import ConversationalBI

# Initialize
bi = ConversationalBI(connection_string="postgresql://localhost/sales_db")

# Start session
session = bi.start_session(user_id="user123")

# Ask questions
response = session.ask("Show me top 10 customers by revenue")
print(response.sql)
print(response.explanation)
print(response.results.to_dataframe())

# Follow-up (uses context)
response = session.ask("Which region had the highest?")
print(response.sql)  # Incorporates previous query context
```

## Configuration
```yaml
conversational_bi:
  session:
    max_sessions_per_user: 10
    idle_timeout_minutes: 30
    max_turns: 20

  models:
    primary:
      provider: openai
      model: gpt-4
      api_key: ${OPENAI_API_KEY}
    fallback:
      provider: ollama
      model: llama3
      endpoint: http://localhost:11434

  cache:
    enable_query_cache: true
    query_cache_size: 10000
    query_cache_ttl_hours: 1

  performance:
    max_concurrent_sessions: 1000
    latency_target_ms: 2000
```

## Testing Commands
```bash
# Run accuracy benchmarks
cargo test --package heliosdb-conversational-bi --test accuracy_tests

# Run BIRD benchmark
cargo test --package heliosdb-conversational-bi bird_benchmark -- --nocapture

# Run performance benchmarks
cargo bench --package heliosdb-conversational-bi

# Run specific test
cargo test --package heliosdb-conversational-bi test_multi_turn_conversation
```

## Performance Tuning
### Latency Optimization
```rust
// Enable aggressive caching
let config = ConversationalConfig {
    enable_query_cache: true,
    enable_schema_cache: true,
    query_cache_size: 100_000, // Increase cache size
    ..Default::default()
};

// Use faster model for simple queries
let requirements = ModelRequirements {
    max_latency_ms: Some(1000), // Force fast model
    ..Default::default()
};
```

### Accuracy Optimization
```rust
// Use best model for complex queries
let requirements = ModelRequirements {
    min_accuracy: Some(0.95),
    require_local: false, // Allow cloud models
    ..Default::default()
};

// Load more few-shot examples
example_store.load_benchmark_examples("BIRD").await?;
example_store.load_benchmark_examples("Spider").await?;
```

## Troubleshooting
### Low Accuracy

**Problem:** Generated SQL is incorrect
Solutions:
- Check schema augmentation: `SchemaAugmenter::augment()`
- Add more few-shot examples: `ExampleStore::add_example()`
- Use a higher-quality model: switch to GPT-4
- Enable self-correction: increase `max_correction_attempts`
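Conceptually, self-correction is a bounded retry loop: validate the generated SQL and, on failure, feed the validator's error message back into the next generation attempt. A hedged sketch of that loop (the `generate` and `validate` callables stand in for the real engine; this is not the actual HeliosDB implementation):

```python
def generate_with_correction(question, generate, validate, max_correction_attempts=3):
    """Generate SQL, re-prompting with the validator's error message on failure."""
    feedback = None
    for attempt in range(max_correction_attempts + 1):
        sql = generate(question, feedback)  # feedback is None on the first try
        ok, error = validate(sql)
        if ok:
            return sql, attempt             # attempt = number of corrections used
        feedback = f"Previous SQL failed validation: {error}. Please fix it."
    raise RuntimeError(f"SQL still invalid after {max_correction_attempts} corrections")
```

Raising `max_correction_attempts` improves accuracy on hard queries at the cost of worst-case latency, since each correction is another model round trip.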
### High Latency

**Problem:** Query generation takes >5s
Solutions:
- Check cache hit rate: `QueryCache::hit_rate()`
- Reduce schema size: `SchemaAugmenter::filter_relevant_schema()`
- Use a smaller model: switch to Claude Haiku or GPT-3.5-turbo
- Optimize the prompt: reduce few-shot examples from 5 to 3
### Context Loss

**Problem:** Multi-turn references are not resolved
Solutions:
- Check context window: increase `context_window_tokens`
- Verify entity tracking: `ContextTracker::get_entities()`
- Check reference resolver: `ReferenceResolver::resolve()`
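At its core, reference resolution rewrites a follow-up like "Show that by month" by substituting tracked entities for demonstratives before SQL generation. A toy sketch of the idea (not the actual `ReferenceResolver` internals; real resolution also handles ellipsis and ambiguity):

```python
REFERENCE_WORDS = {"that", "it", "this", "those", "them"}

def resolve_references(question, tracked_entities):
    """Replace demonstrative references with the most recent tracked entity.

    tracked_entities: most-recent-last list of entity strings from prior turns.
    """
    if not tracked_entities:
        return question
    latest = tracked_entities[-1]
    words = [latest if w.lower().strip("?.,") in REFERENCE_WORDS else w
             for w in question.split()]
    return " ".join(words)
```

If resolution fails, check that `ContextTracker::get_entities()` is actually returning entities from the prior turns; an empty entity list makes any resolver a no-op.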
## Metrics to Monitor

### Accuracy Metrics
```text
# Prometheus metrics
conversational_bi_accuracy_total{benchmark="BIRD"} 0.95
conversational_bi_accuracy_total{benchmark="Spider"} 0.98
conversational_bi_accuracy_total{benchmark="WikiSQL"} 0.99
```

### Performance Metrics
```text
conversational_bi_query_latency_seconds_bucket{le="2.0"} 0.85  # 85% under 2s
conversational_bi_query_latency_seconds_bucket{le="5.0"} 0.99  # 99% under 5s
conversational_bi_cache_hit_rate 0.83                          # 83% cache hits
conversational_bi_throughput_qps 120                           # 120 queries/sec
```

### Error Metrics
```text
conversational_bi_validation_failures_total 12
conversational_bi_correction_attempts_total 45
conversational_bi_timeout_errors_total 3
```

## Common Patterns
### Multi-Turn Conversation
```python
# Turn 1: Initial query
response = session.ask("Show sales by region")
# SQL: SELECT region, SUM(amount) FROM sales GROUP BY region

# Turn 2: Reference previous
response = session.ask("Which had the highest?")
# SQL: SELECT region, SUM(amount) as total
#      FROM sales GROUP BY region
#      ORDER BY total DESC LIMIT 1

# Turn 3: Drill down
response = session.ask("Show that by month")
# SQL: SELECT DATE_TRUNC('month', date) as month,
#             region, SUM(amount)
#      FROM sales
#      WHERE region = 'West'  -- From previous context
#      GROUP BY month, region
```

### Handling Ambiguity
```python
response = session.ask("Show revenue")
# Response includes clarifications:
# {
#   "clarifications": [
#     "Which time period? (This month | Last quarter | This year)",
#     "Which product line? (All | Electronics | Clothing)"
#   ]
# }

# User provides clarification
response = session.ask("Last quarter for electronics")
# Now generates specific SQL
```

### Schema-Specific Queries
```python
# System automatically maps business terms to schema
response = session.ask("Show customer lifetime value")
# Maps "customer lifetime value" → SUM(orders.amount) GROUP BY customers.id

response = session.ask("Find churned users")
# Maps "churned" → last_login_date < NOW() - INTERVAL '90 days'
```

## Performance Benchmarks
| Benchmark | Target | Expected |
|---|---|---|
| BIRD Accuracy | 95% | 95-97% |
| Spider Accuracy | 98% | 98-99% |
| Latency (p50) | <2s | 1.5-1.8s |
| Latency (p99) | <5s | 3-4s |
| Cache Hit Rate | 80% | 85-90% |
| Throughput | 100 QPS | 120-150 QPS |
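To check the p50/p99 latency targets above against your own measurements, a nearest-rank percentile over a batch of observed latencies is enough (a sketch; in production these figures usually come from the Prometheus histogram buckets shown earlier):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: p in (0, 100], samples in any order."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

# Example: latencies in milliseconds collected from 100 test queries
latencies = [1500 + 20 * i for i in range(100)]
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
```

Compare `p50` against the 2s target and `p99` against the 5s target; a passing p50 with a failing p99 typically points at cache misses or fallback-model round trips on the tail.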
## Dependencies
```toml
[dependencies]
# HeliosDB
heliosdb-common = { path = "../heliosdb-common" }
heliosdb-compute = { path = "../heliosdb-compute" }
heliosdb-ml = { path = "../heliosdb-ml" }

# LLM clients
async-openai = "0.24"
anthropic = "0.2"
cohere-rust = "0.1"

# SQL parsing
sqlparser = "0.51"

# Async runtime
tokio = { version = "1.40", features = ["full"] }

# Caching
moka = { version = "0.12", features = ["future"] }

# Embeddings
fastembed = "3.0"

# Monitoring
prometheus = "0.13"
tracing = "0.1"
```

## File Structure
```text
heliosdb-conversational-bi/
├── src/
│   ├── lib.rs                 # Public API
│   ├── engine.rs              # Main engine
│   ├── session/
│   │   ├── manager.rs         # Session lifecycle
│   │   └── context.rs         # Conversation context
│   ├── nl2sql/
│   │   ├── generator.rs       # SQL generation
│   │   ├── validator.rs       # SQL validation
│   │   └── corrector.rs       # Self-correction
│   ├── schema/
│   │   ├── augmenter.rs       # Schema enhancement
│   │   └── cache.rs           # Schema caching
│   ├── context/
│   │   ├── tracker.rs         # Context tracking
│   │   └── reference.rs       # Reference resolution
│   ├── explanation/
│   │   └── explainer.rs       # Query explanation
│   └── models/
│       ├── router.rs          # Model routing
│       ├── openai.rs          # OpenAI client
│       └── ollama.rs          # Ollama client
├── tests/
│   ├── accuracy_tests.rs      # BIRD, Spider benchmarks
│   ├── multi_turn_tests.rs    # Context tests
│   └── performance_tests.rs   # Latency/throughput
└── benches/
    ├── accuracy_benchmarks.rs
    └── latency_benchmarks.rs
```

## Quick Links
- Architecture: CONVERSATIONAL_BI_ARCHITECTURE.md
- Implementation: CONVERSATIONAL_BI_IMPLEMENTATION_GUIDE.md
- Summary: CONVERSATIONAL_BI_ARCHITECTURE_SUMMARY.md
Last Updated: November 9, 2025