
Conversational BI - Quick Reference

Developer Cheat Sheet


Architecture at a Glance

User Query → Session Mgr → Context Tracker → NL2SQL Engine → SQL Validator

NL2SQL Engine:
  Query Cache Check
    [Cache Hit]  → Return cached SQL
    [Cache Miss] → Generate via LLM:
                     Schema Augmenter → Example Store (few-shot) →
                     Prompt Builder → Model Router (GPT-4 / Claude / Ollama)

SQL Validation → Self-Correction (if needed) → Query Execution
  → Explanation Engine → Optimizer Suggester → Response Assembly
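The cache-check branch of the pipeline can be sketched in Python. This is a toy illustration, not HeliosDB's implementation: a real deployment would embed queries with a model (the dependencies list fastembed), whereas here the embeddings and the `SemanticQueryCache` class are invented for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticQueryCache:
    """Toy semantic cache: map an incoming query embedding to cached SQL
    when a previously seen query is similar enough."""

    def __init__(self, threshold=0.9):
        self.entries = []  # list of (embedding, sql) pairs
        self.threshold = threshold

    def lookup(self, embedding):
        best_sql, best_sim = None, 0.0
        for emb, sql in self.entries:
            sim = cosine(embedding, emb)
            if sim > best_sim:
                best_sql, best_sim = sql, sim
        # None signals a cache miss: fall through to LLM generation
        return best_sql if best_sim >= self.threshold else None

    def insert(self, embedding, sql):
        self.entries.append((embedding, sql))

cache = SemanticQueryCache()
cache.insert([1.0, 0.0, 0.2], "SELECT region, SUM(amount) FROM sales GROUP BY region")
print(cache.lookup([0.98, 0.05, 0.21]))  # near-identical phrasing → cache hit
print(cache.lookup([0.0, 1.0, 0.0]))     # unrelated query → None (miss)
```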

Core Components

| Component         | Responsibility                      | Key Metrics               |
| ----------------- | ----------------------------------- | ------------------------- |
| SessionManager    | Manages conversation lifecycle      | 1,000 concurrent sessions |
| ContextTracker    | Multi-turn state management         | 10+ turn retention        |
| Nl2SqlEngine      | Generates SQL from natural language | 95%+ accuracy             |
| SchemaAugmenter   | Enhances schema semantics           | 20-30% accuracy gain      |
| ExplanationEngine | Natural-language explanations       | 100% query coverage       |
| ModelRouter       | LLM selection                       | <100 ms routing           |
| QueryCache        | Semantic caching                    | 80%+ hit rate             |

API Examples

REST API

# Create session
curl -X POST http://localhost:8080/api/v1/conversational-bi/sessions \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user123", "database": "sales_db"}'
# Response: {"session_id": "sess_abc123"}

# Ask question
curl -X POST http://localhost:8080/api/v1/conversational-bi/query \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "sess_abc123",
    "query": "Show me top 10 customers by revenue"
  }'

# Response:
{
  "turn_id": 1,
  "sql": "SELECT customer_name, SUM(amount) AS revenue FROM ...",
  "explanation": "This query finds the top 10 customers...",
  "results": { "columns": [...], "rows": [...] },
  "confidence": 0.95,
  "suggestions": ["Add index on customers(id)..."],
  "latency_ms": 1850
}

SQL Interface

-- Start session
SELECT conversational_bi.start_session('sales_db');
-- Returns: sess_abc123

-- Ask question
SELECT * FROM conversational_bi.ask(
  'sess_abc123',
  'Show me top 10 customers by revenue'
);
-- Returns: SQL, explanation, results

-- Get history
SELECT * FROM conversational_bi.history('sess_abc123');

Python SDK

from heliosdb import ConversationalBI

# Initialize
bi = ConversationalBI(connection_string="postgresql://localhost/sales_db")

# Start session
session = bi.start_session(user_id="user123")

# Ask questions
response = session.ask("Show me top 10 customers by revenue")
print(response.sql)
print(response.explanation)
print(response.results.to_dataframe())

# Follow-up (uses context)
response = session.ask("Which region had the highest?")
print(response.sql)  # Incorporates previous query context

Configuration

config/conversational_bi.yaml
conversational_bi:
  session:
    max_sessions_per_user: 10
    idle_timeout_minutes: 30
    max_turns: 20
  models:
    primary:
      provider: openai
      model: gpt-4
      api_key: ${OPENAI_API_KEY}
    fallback:
      provider: ollama
      model: llama3
      endpoint: http://localhost:11434
  cache:
    enable_query_cache: true
    query_cache_size: 10000
    query_cache_ttl_hours: 1
  performance:
    max_concurrent_sessions: 1000
    latency_target_ms: 2000
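The `${OPENAI_API_KEY}` placeholder implies environment-variable interpolation at config load time. A minimal sketch of how such interpolation could work on the parsed config dict (the `interpolate_env` helper is hypothetical, not part of the product):

```python
import os
import re

_ENV_PATTERN = re.compile(r"\$\{([A-Z0-9_]+)\}")

def interpolate_env(value):
    """Replace ${VAR} placeholders with environment values,
    recursing into nested dicts and lists."""
    if isinstance(value, str):
        return _ENV_PATTERN.sub(lambda m: os.environ.get(m.group(1), ""), value)
    if isinstance(value, dict):
        return {k: interpolate_env(v) for k, v in value.items()}
    if isinstance(value, list):
        return [interpolate_env(v) for v in value]
    return value

os.environ["OPENAI_API_KEY"] = "sk-test"
cfg = {"models": {"primary": {"provider": "openai", "api_key": "${OPENAI_API_KEY}"}}}
print(interpolate_env(cfg)["models"]["primary"]["api_key"])  # sk-test
```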

Testing Commands

# Run accuracy benchmarks
cargo test --package heliosdb-conversational-bi --test accuracy_tests

# Run BIRD benchmark
cargo test --package heliosdb-conversational-bi bird_benchmark -- --nocapture

# Run performance benchmarks
cargo bench --package heliosdb-conversational-bi

# Run specific test
cargo test --package heliosdb-conversational-bi test_multi_turn_conversation

Performance Tuning

Latency Optimization

// Enable aggressive caching
let config = ConversationalConfig {
    enable_query_cache: true,
    enable_schema_cache: true,
    query_cache_size: 100_000, // Increase cache size
    ..Default::default()
};

// Use a faster model for simple queries
let requirements = ModelRequirements {
    max_latency_ms: Some(1000), // Force fast model
    ..Default::default()
};

Accuracy Optimization

// Use the best model for complex queries
let requirements = ModelRequirements {
    min_accuracy: Some(0.95),
    require_local: false, // Allow cloud models
    ..Default::default()
};

// Load more few-shot examples
example_store.load_benchmark_examples("BIRD").await?;
example_store.load_benchmark_examples("Spider").await?;

Troubleshooting

Low Accuracy

Problem: Generated SQL is incorrect

Solutions:

  1. Check schema augmentation: SchemaAugmenter::augment()
  2. Add more few-shot examples: ExampleStore::add_example()
  3. Use higher-quality model: Switch to GPT-4
  4. Enable self-correction: Increase max_correction_attempts
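Solution 2 can be illustrated with a toy example store. The real `ExampleStore` presumably retrieves few-shot examples by embedding similarity; this Python sketch substitutes plain token overlap, and the class and method names here are invented for illustration:

```python
class ExampleStore:
    """Toy few-shot store: retrieve the k stored examples most similar
    to the incoming question, scored by word overlap."""

    def __init__(self):
        self.examples = []  # list of (question, sql) pairs

    def add_example(self, question, sql):
        self.examples.append((question, sql))

    def top_k(self, question, k=3):
        q = set(question.lower().split())
        scored = sorted(
            self.examples,
            key=lambda ex: len(q & set(ex[0].lower().split())),
            reverse=True,
        )
        return scored[:k]

store = ExampleStore()
store.add_example(
    "top 10 customers by revenue",
    "SELECT customer_name, SUM(amount) FROM orders GROUP BY 1 ORDER BY 2 DESC LIMIT 10",
)
store.add_example("count active users", "SELECT COUNT(*) FROM users WHERE active")

best = store.top_k("top 5 customers by total revenue", k=1)
print(best[0][0])  # "top 10 customers by revenue"
```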

High Latency

Problem: Query generation takes >5s

Solutions:

  1. Check cache hit rate: QueryCache::hit_rate()
  2. Reduce schema size: SchemaAugmenter::filter_relevant_schema()
  3. Use smaller model: Switch to Claude Haiku or GPT-3.5-turbo
  4. Optimize prompt: Reduce few-shot examples from 5 to 3
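Solution 2 (`filter_relevant_schema`) can be sketched as a keyword-based relevance filter that shrinks the schema passed into the prompt. This is an illustrative approximation with invented names; the actual augmenter likely uses embeddings rather than string matching:

```python
def filter_relevant_schema(schema, question):
    """Keep only tables whose name (or a column name) appears in the
    question — a toy relevance filter to shrink the prompt."""
    words = set(question.lower().replace(",", " ").split())
    relevant = {}
    for table, columns in schema.items():
        # crude singular/plural match on the table name, plus exact column hits
        names = {table.lower().rstrip("s")} | {c.lower() for c in columns}
        if table.lower() in words or names & words:
            relevant[table] = columns
    return relevant

schema = {
    "customers": ["customer_name", "region"],
    "orders": ["amount", "customer_id", "date"],
    "inventory": ["sku", "stock"],
}
print(filter_relevant_schema(schema, "top customers by order amount"))
# keeps customers and orders, drops inventory
```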

Context Loss

Problem: Multi-turn references not resolved

Solutions:

  1. Check context window: Increase context_window_tokens
  2. Verify entity tracking: ContextTracker::get_entities()
  3. Check reference resolver: ReferenceResolver::resolve()
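The resolver in step 3 can be approximated with a toy heuristic: if a follow-up contains a vague reference, attach the most recently tracked entity. The real `ReferenceResolver` is certainly more sophisticated; the function and constant below are illustrative only:

```python
VAGUE_REFERENCES = ("which", "that", "it", "those")

def resolve_references(question, tracked_entities):
    """Rewrite a vague follow-up using entities tracked from earlier
    turns (toy heuristic resolver)."""
    words = question.lower().split()
    if any(ref in words for ref in VAGUE_REFERENCES) and tracked_entities:
        subject = tracked_entities[-1]  # most recently mentioned entity
        return f"{question} (referring to {subject})"
    return question

entities = ["sales by region"]
print(resolve_references("Which had the highest?", entities))
# → "Which had the highest? (referring to sales by region)"
```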

Metrics to Monitor

Accuracy Metrics

# Prometheus metrics
conversational_bi_accuracy_total{benchmark="BIRD"} 0.95
conversational_bi_accuracy_total{benchmark="Spider"} 0.98
conversational_bi_accuracy_total{benchmark="WikiSQL"} 0.99

Performance Metrics

conversational_bi_query_latency_seconds_bucket{le="2.0"} 0.85  # 85% under 2s
conversational_bi_query_latency_seconds_bucket{le="5.0"} 0.99  # 99% under 5s
conversational_bi_cache_hit_rate 0.83                          # 83% cache hits
conversational_bi_throughput_qps 120                           # 120 queries/sec
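The `_bucket{le=...}` values are cumulative histogram fractions. Given raw latency samples, they can be recomputed as follows; this is a sketch of the arithmetic, not the actual metrics pipeline:

```python
def bucket_fractions(latencies_ms, bounds_s=(2.0, 5.0)):
    """Cumulative histogram fractions: the share of samples at or
    below each `le` bound, matching the _bucket metrics semantics."""
    n = len(latencies_ms)
    return {
        le: sum(1 for ms in latencies_ms if ms / 1000.0 <= le) / n
        for le in bounds_s
    }

samples = [850, 1200, 1850, 2400, 3100, 4200, 1500, 900, 1700, 6000]
print(bucket_fractions(samples))  # → {2.0: 0.6, 5.0: 0.9}
```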

Error Metrics

conversational_bi_validation_failures_total 12
conversational_bi_correction_attempts_total 45
conversational_bi_timeout_errors_total 3

Common Patterns

Multi-Turn Conversation

# Turn 1: Initial query
response = session.ask("Show sales by region")
# SQL: SELECT region, SUM(amount) FROM sales GROUP BY region
# Turn 2: Reference previous
response = session.ask("Which had the highest?")
# SQL: SELECT region, SUM(amount) as total
# FROM sales GROUP BY region
# ORDER BY total DESC LIMIT 1
# Turn 3: Drill down
response = session.ask("Show that by month")
# SQL: SELECT DATE_TRUNC('month', date) as month,
# region, SUM(amount)
# FROM sales
# WHERE region = 'West' -- From previous context
# GROUP BY month, region

Handling Ambiguity

response = session.ask("Show revenue")
# Response includes clarifications:
# {
#   "clarifications": [
#     "Which time period? (This month | Last quarter | This year)",
#     "Which product line? (All | Electronics | Clothing)"
#   ]
# }

# User provides clarification
response = session.ask("Last quarter for electronics")
# Now generates specific SQL

Schema-Specific Queries

# The system automatically maps business terms to the schema
response = session.ask("Show customer lifetime value")
# Maps "customer lifetime value" → SUM(orders.amount) GROUP BY customers.id

response = session.ask("Find churned users")
# Maps "churned" → last_login_date < NOW() - INTERVAL '90 days'
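One plausible way such mappings reach the model is a business-term glossary injected into the prompt. The structure below is an assumption for illustration; the glossary contents are taken from the two mappings above, but the `GLOSSARY` dict and `expand_business_terms` helper are not real HeliosDB APIs:

```python
# Hypothetical business glossary: term → schema-level definition
GLOSSARY = {
    "customer lifetime value": "SUM(orders.amount) GROUP BY customers.id",
    "churned": "last_login_date < NOW() - INTERVAL '90 days'",
}

def expand_business_terms(question):
    """Return the schema-level definitions of any known business
    terms found in the question (to be appended to the prompt)."""
    q = question.lower()
    return {term: defn for term, defn in GLOSSARY.items() if term in q}

print(expand_business_terms("Show customer lifetime value"))
```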

Performance Benchmarks

| Benchmark       | Target  | Expected    | Status |
| --------------- | ------- | ----------- | ------ |
| BIRD Accuracy   | 95%     | 95-97%      |        |
| Spider Accuracy | 98%     | 98-99%      |        |
| Latency (p50)   | <2s     | 1.5-1.8s    |        |
| Latency (p99)   | <5s     | 3-4s        |        |
| Cache Hit Rate  | 80%     | 85-90%      |        |
| Throughput      | 100 QPS | 120-150 QPS |        |

Dependencies

[dependencies]
# HeliosDB
heliosdb-common = { path = "../heliosdb-common" }
heliosdb-compute = { path = "../heliosdb-compute" }
heliosdb-ml = { path = "../heliosdb-ml" }
# LLM clients
async-openai = "0.24"
anthropic = "0.2"
cohere-rust = "0.1"
# SQL parsing
sqlparser = "0.51"
# Async runtime
tokio = { version = "1.40", features = ["full"] }
# Caching
moka = { version = "0.12", features = ["future"] }
# Embeddings
fastembed = "3.0"
# Monitoring
prometheus = "0.13"
tracing = "0.1"

File Structure

heliosdb-conversational-bi/
├── src/
│ ├── lib.rs # Public API
│ ├── engine.rs # Main engine
│ ├── session/
│ │ ├── manager.rs # Session lifecycle
│ │ └── context.rs # Conversation context
│ ├── nl2sql/
│ │ ├── generator.rs # SQL generation
│ │ ├── validator.rs # SQL validation
│ │ └── corrector.rs # Self-correction
│ ├── schema/
│ │ ├── augmenter.rs # Schema enhancement
│ │ └── cache.rs # Schema caching
│ ├── context/
│ │ ├── tracker.rs # Context tracking
│ │ └── reference.rs # Reference resolution
│ ├── explanation/
│ │ └── explainer.rs # Query explanation
│ └── models/
│ ├── router.rs # Model routing
│ ├── openai.rs # OpenAI client
│ └── ollama.rs # Ollama client
├── tests/
│ ├── accuracy_tests.rs # BIRD, Spider benchmarks
│ ├── multi_turn_tests.rs # Context tests
│ └── performance_tests.rs # Latency/throughput
└── benches/
    ├── accuracy_benchmarks.rs
    └── latency_benchmarks.rs


Last Updated: November 9, 2025