# Conversational BI - Quick Reference

Developer Cheat Sheet
## Architecture at a Glance
```text
User Query → Session Mgr → Context Tracker → NL2SQL Engine → SQL Validator
                      ↓
              Query Cache Check
                      ↓
  [Cache Hit]  → Return Cached SQL
  [Cache Miss] → Generate via LLM
                      ↓
  Schema Augmenter → Example Store (Few-shot) → Prompt Builder
                      ↓
  Model Router (GPT-4 / Claude / Ollama)
                      ↓
  SQL Validation → Self-Correction (if needed)
                      ↓
               Query Execution
                      ↓
  Explanation Engine → Optimizer Suggester
                      ↓
              Response Assembly
```

## Core Components
| Component | Responsibility | Key Metrics |
|---|---|---|
| SessionManager | Manage conversation lifecycle | 1,000 concurrent sessions |
| ContextTracker | Multi-turn state management | 10+ turn retention |
| Nl2SqlEngine | Generate SQL from NL | 95%+ accuracy |
| SchemaAugmenter | Enhance schema semantics | 20-30% accuracy gain |
| ExplanationEngine | NL explanations | 100% query coverage |
| ModelRouter | LLM selection | <100ms routing |
| QueryCache | Semantic caching | 80%+ hit rate |
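The QueryCache row above refers to *semantic* caching: incoming questions are matched by embedding similarity rather than exact text, so rephrasings of a cached question can still hit. A minimal sketch of the idea (the class name, `embed_fn` callable, and 0.9 threshold are illustrative assumptions, not the actual HeliosDB API):

```python
import math

class SemanticQueryCache:
    """Toy semantic cache: matches questions by cosine similarity of embeddings."""

    def __init__(self, embed_fn, threshold=0.9):
        self.embed_fn = embed_fn   # callable: str -> list[float]
        self.threshold = threshold
        self.entries = []          # list of (embedding, sql)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    def get(self, question):
        q = self.embed_fn(question)
        best = max(self.entries, key=lambda e: self._cosine(q, e[0]), default=None)
        if best and self._cosine(q, best[0]) >= self.threshold:
            return best[1]         # cache hit: reuse previously validated SQL
        return None

    def put(self, question, sql):
        self.entries.append((self.embed_fn(question), sql))
```

In production the embedding would come from a model (the dependencies below include `fastembed`) and lookup would use an index rather than a linear scan; the threshold trades hit rate against the risk of reusing SQL for a subtly different question.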
## API Examples

### REST API
```bash
# Create session
curl -X POST http://localhost:8080/api/v1/conversational-bi/sessions \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user123", "database": "sales_db"}'
# Response: {"session_id": "sess_abc123"}

# Ask question
curl -X POST http://localhost:8080/api/v1/conversational-bi/query \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "sess_abc123",
    "query": "Show me top 10 customers by revenue"
  }'
# Response:
# {
#   "turn_id": 1,
#   "sql": "SELECT customer_name, SUM(amount) as revenue FROM ...",
#   "explanation": "This query finds the top 10 customers...",
#   "results": { "columns": [...], "rows": [...] },
#   "confidence": 0.95,
#   "suggestions": ["Add index on customers(id)..."],
#   "latency_ms": 1850
# }
```

### SQL Interface
```sql
-- Start session
SELECT conversational_bi.start_session('sales_db');
-- Returns: sess_abc123

-- Ask question
SELECT * FROM conversational_bi.ask(
  'sess_abc123',
  'Show me top 10 customers by revenue'
);
-- Returns: SQL, explanation, results

-- Get history
SELECT * FROM conversational_bi.history('sess_abc123');
```

### Python SDK
```python
from heliosdb import ConversationalBI

# Initialize
bi = ConversationalBI(connection_string="postgresql://localhost/sales_db")

# Start session
session = bi.start_session(user_id="user123")

# Ask questions
response = session.ask("Show me top 10 customers by revenue")
print(response.sql)
print(response.explanation)
print(response.results.to_dataframe())

# Follow-up (uses context)
response = session.ask("Which region had the highest?")
print(response.sql)  # Incorporates previous query context
```

## Configuration
```yaml
conversational_bi:
  session:
    max_sessions_per_user: 10
    idle_timeout_minutes: 30
    max_turns: 20

  models:
    primary:
      provider: openai
      model: gpt-4
      api_key: ${OPENAI_API_KEY}
    fallback:
      provider: ollama
      model: llama3
      endpoint: http://localhost:11434

  cache:
    enable_query_cache: true
    query_cache_size: 10000
    query_cache_ttl_hours: 1

  performance:
    max_concurrent_sessions: 1000
    latency_target_ms: 2000
```

## Testing Commands
```bash
# Run accuracy benchmarks
cargo test --package heliosdb-conversational-bi --test accuracy_tests

# Run BIRD benchmark
cargo test --package heliosdb-conversational-bi bird_benchmark -- --nocapture

# Run performance benchmarks
cargo bench --package heliosdb-conversational-bi

# Run specific test
cargo test --package heliosdb-conversational-bi test_multi_turn_conversation
```

## Performance Tuning
### Latency Optimization
```rust
// Enable aggressive caching
let config = ConversationalConfig {
    enable_query_cache: true,
    enable_schema_cache: true,
    query_cache_size: 100_000, // Increase cache size
    ..Default::default()
};

// Use faster model for simple queries
let requirements = ModelRequirements {
    max_latency_ms: Some(1000), // Force fast model
    ..Default::default()
};
```

### Accuracy Optimization
```rust
// Use best model for complex queries
let requirements = ModelRequirements {
    min_accuracy: Some(0.95),
    require_local: false, // Allow cloud models
    ..Default::default()
};

// Load more few-shot examples
example_store.load_benchmark_examples("BIRD").await?;
example_store.load_benchmark_examples("Spider").await?;
```

## Troubleshooting
### Low Accuracy

**Problem:** Generated SQL is incorrect
Solutions:
- Check schema augmentation: `SchemaAugmenter::augment()`
- Add more few-shot examples: `ExampleStore::add_example()`
- Use a higher-quality model: switch to GPT-4
- Enable self-correction: increase `max_correction_attempts`
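Conceptually, self-correction is a bounded retry loop: validate the generated SQL and, on failure, feed the validator's error message back into the next generation attempt. A hedged sketch of that loop (the `generate` and `validate` callables stand in for the real engine; this is not the actual HeliosDB implementation):

```python
def generate_with_correction(question, generate, validate, max_correction_attempts=3):
    """Generate SQL, re-prompting with the validator's error message on failure."""
    feedback = None
    for attempt in range(max_correction_attempts + 1):
        sql = generate(question, feedback)  # feedback is None on the first try
        ok, error = validate(sql)
        if ok:
            return sql, attempt             # attempt = number of corrections used
        feedback = f"Previous SQL failed validation: {error}. Please fix it."
    raise RuntimeError(f"SQL still invalid after {max_correction_attempts} corrections")
```

Raising `max_correction_attempts` improves accuracy on hard queries at the cost of worst-case latency, since each correction is another model round trip.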
### High Latency

**Problem:** Query generation takes >5s
Solutions:
- Check cache hit rate: `QueryCache::hit_rate()`
- Reduce schema size: `SchemaAugmenter::filter_relevant_schema()`
- Use a smaller model: switch to Claude Haiku or GPT-3.5-turbo
- Optimize the prompt: reduce few-shot examples from 5 to 3
### Context Loss

**Problem:** Multi-turn references are not resolved
Solutions:
- Check context window: increase `context_window_tokens`
- Verify entity tracking: `ContextTracker::get_entities()`
- Check reference resolver: `ReferenceResolver::resolve()`
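At its core, reference resolution rewrites a follow-up like "Show that by month" by substituting tracked entities for demonstratives before SQL generation. A toy sketch of the idea (not the actual `ReferenceResolver` internals; real resolution also handles ellipsis and ambiguity):

```python
REFERENCE_WORDS = {"that", "it", "this", "those", "them"}

def resolve_references(question, tracked_entities):
    """Replace demonstrative references with the most recent tracked entity.

    tracked_entities: most-recent-last list of entity strings from prior turns.
    """
    if not tracked_entities:
        return question
    latest = tracked_entities[-1]
    words = [latest if w.lower().strip("?.,") in REFERENCE_WORDS else w
             for w in question.split()]
    return " ".join(words)
```

If resolution fails, check that `ContextTracker::get_entities()` is actually returning entities from the prior turns; an empty entity list makes any resolver a no-op.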
## Metrics to Monitor

### Accuracy Metrics
```text
# Prometheus metrics
conversational_bi_accuracy_total{benchmark="BIRD"} 0.95
conversational_bi_accuracy_total{benchmark="Spider"} 0.98
conversational_bi_accuracy_total{benchmark="WikiSQL"} 0.99
```

### Performance Metrics
```text
conversational_bi_query_latency_seconds_bucket{le="2.0"} 0.85  # 85% under 2s
conversational_bi_query_latency_seconds_bucket{le="5.0"} 0.99  # 99% under 5s
conversational_bi_cache_hit_rate 0.83                          # 83% cache hits
conversational_bi_throughput_qps 120                           # 120 queries/sec
```

### Error Metrics
```text
conversational_bi_validation_failures_total 12
conversational_bi_correction_attempts_total 45
conversational_bi_timeout_errors_total 3
```

## Common Patterns
### Multi-Turn Conversation
```python
# Turn 1: Initial query
response = session.ask("Show sales by region")
# SQL: SELECT region, SUM(amount) FROM sales GROUP BY region

# Turn 2: Reference previous
response = session.ask("Which had the highest?")
# SQL: SELECT region, SUM(amount) as total
#      FROM sales GROUP BY region
#      ORDER BY total DESC LIMIT 1

# Turn 3: Drill down
response = session.ask("Show that by month")
# SQL: SELECT DATE_TRUNC('month', date) as month,
#             region, SUM(amount)
#      FROM sales
#      WHERE region = 'West'  -- From previous context
#      GROUP BY month, region
```

### Handling Ambiguity
```python
response = session.ask("Show revenue")
# Response includes clarifications:
# {
#   "clarifications": [
#     "Which time period? (This month | Last quarter | This year)",
#     "Which product line? (All | Electronics | Clothing)"
#   ]
# }

# User provides clarification
response = session.ask("Last quarter for electronics")
# Now generates specific SQL
```

### Schema-Specific Queries
```python
# System automatically maps business terms to schema
response = session.ask("Show customer lifetime value")
# Maps "customer lifetime value" → SUM(orders.amount) GROUP BY customers.id

response = session.ask("Find churned users")
# Maps "churned" → last_login_date < NOW() - INTERVAL '90 days'
```

## Performance Benchmarks
| Benchmark | Target | Expected |
|---|---|---|
| BIRD Accuracy | 95% | 95-97% |
| Spider Accuracy | 98% | 98-99% |
| Latency (p50) | <2s | 1.5-1.8s |
| Latency (p99) | <5s | 3-4s |
| Cache Hit Rate | 80% | 85-90% |
| Throughput | 100 QPS | 120-150 QPS |
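To check the p50/p99 latency targets above against your own measurements, a nearest-rank percentile over a batch of observed latencies is enough (a sketch; in production these figures usually come from the Prometheus histogram buckets shown earlier):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: p in (0, 100], samples in any order."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

# Example: latencies in milliseconds collected from 100 test queries
latencies = [1500 + 20 * i for i in range(100)]
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
```

Compare `p50` against the 2s target and `p99` against the 5s target; a passing p50 with a failing p99 typically points at cache misses or fallback-model round trips on the tail.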
## Dependencies
```toml
[dependencies]
# HeliosDB
heliosdb-common = { path = "../heliosdb-common" }
heliosdb-compute = { path = "../heliosdb-compute" }
heliosdb-ml = { path = "../heliosdb-ml" }

# LLM clients
async-openai = "0.24"
anthropic = "0.2"
cohere-rust = "0.1"

# SQL parsing
sqlparser = "0.51"

# Async runtime
tokio = { version = "1.40", features = ["full"] }

# Caching
moka = { version = "0.12", features = ["future"] }

# Embeddings
fastembed = "3.0"

# Monitoring
prometheus = "0.13"
tracing = "0.1"
```

## File Structure
```text
heliosdb-conversational-bi/
├── src/
│   ├── lib.rs                 # Public API
│   ├── engine.rs              # Main engine
│   ├── session/
│   │   ├── manager.rs         # Session lifecycle
│   │   └── context.rs         # Conversation context
│   ├── nl2sql/
│   │   ├── generator.rs       # SQL generation
│   │   ├── validator.rs       # SQL validation
│   │   └── corrector.rs       # Self-correction
│   ├── schema/
│   │   ├── augmenter.rs       # Schema enhancement
│   │   └── cache.rs           # Schema caching
│   ├── context/
│   │   ├── tracker.rs         # Context tracking
│   │   └── reference.rs       # Reference resolution
│   ├── explanation/
│   │   └── explainer.rs       # Query explanation
│   └── models/
│       ├── router.rs          # Model routing
│       ├── openai.rs          # OpenAI client
│       └── ollama.rs          # Ollama client
├── tests/
│   ├── accuracy_tests.rs      # BIRD, Spider benchmarks
│   ├── multi_turn_tests.rs    # Context tests
│   └── performance_tests.rs   # Latency/throughput
└── benches/
    ├── accuracy_benchmarks.rs
    └── latency_benchmarks.rs
```

## Quick Links
- Architecture: CONVERSATIONAL_BI_ARCHITECTURE.md
- Implementation: CONVERSATIONAL_BI_IMPLEMENTATION_GUIDE.md
- Summary: CONVERSATIONAL_BI_ARCHITECTURE_SUMMARY.md
Last Updated: November 9, 2025