HeliosDB Nano Advanced Features Tutorial
HeliosDB Nano Advanced Features Tutorial
Hands-on guide to EXPLAIN, Vector Search, Materialized Views, and Bulk Loading.
Prerequisites: HeliosDB Nano REPL running (./heliosdb-nano repl -m for in-memory mode)
See also: Storage Modes Tutorial for per-column storage optimization.
Table of Contents
- Query Analysis with EXPLAIN
- Vector Search with HNSW Indexes
- Materialized Views
- Bulk Loading for Performance
1. Query Analysis with EXPLAIN
Basic EXPLAIN
CREATE TABLE events ( event_id INT PRIMARY KEY, event_type TEXT STORAGE DICTIONARY, response_time_ms INT STORAGE COLUMNAR);
INSERT INTO events VALUES (1, 'page_view', 45);INSERT INTO events VALUES (2, 'api_call', 23);INSERT INTO events VALUES (3, 'page_view', 67);
EXPLAIN SELECT event_type, COUNT(*) FROM events GROUP BY event_type;Shows the query execution plan.
EXPLAIN Modes
EXPLAIN (VERBOSE) SELECT event_type, AVG(response_time_ms) FROM events GROUP BY event_type;Verbose mode shows cost estimates and cardinality.
EXPLAIN (ANALYZE) SELECT * FROM events WHERE event_type = 'page_view';Analyze mode executes the query and shows actual timing.
Output Formats
EXPLAIN (FORMAT JSON) SELECT * FROM events;
EXPLAIN (FORMAT YAML) SELECT event_type, SUM(response_time_ms) FROM events GROUP BY event_type;Machine-readable formats for programmatic analysis.
DROP TABLE events;2. Vector Search with HNSW Indexes
Create a Knowledge Base
CREATE TABLE knowledge_articles ( article_id INT PRIMARY KEY, title TEXT, content TEXT STORAGE CONTENT_ADDRESSED, category TEXT STORAGE DICTIONARY, embedding VECTOR(4));Insert Articles with Embeddings
INSERT INTO knowledge_articles VALUES ( 1, 'Getting Started Guide', 'Welcome to our platform. This guide helps you get started.', 'onboarding', '[0.1, 0.8, 0.2, 0.5]');
INSERT INTO knowledge_articles VALUES ( 2, 'Password Reset Help', 'To reset your password, click Forgot Password on the login page.', 'security', '[0.3, 0.2, 0.9, 0.1]');
INSERT INTO knowledge_articles VALUES ( 3, 'Billing FAQ', 'Common questions about billing, invoices, and payment methods.', 'billing', '[0.7, 0.1, 0.3, 0.8]');
INSERT INTO knowledge_articles VALUES ( 4, 'API Documentation', 'Complete reference for REST API endpoints and authentication.', 'developer', '[0.2, 0.6, 0.4, 0.7]');
INSERT INTO knowledge_articles VALUES ( 5, 'Account Settings', 'Manage your profile, notifications, and privacy settings.', 'onboarding', '[0.15, 0.75, 0.25, 0.55]');Create HNSW Index
CREATE INDEX idx_kb_embedding ON knowledge_articles USING hnsw (embedding);Semantic Search
Find articles similar to a query embedding (e.g., “how do I change my password”):
SELECT article_id, title, category, embedding <-> '[0.25, 0.3, 0.85, 0.15]' as distanceFROM knowledge_articlesORDER BY embedding <-> '[0.25, 0.3, 0.85, 0.15]'LIMIT 3;Output:
┌────────────┬───────────────────────┬──────────┬──────────┐│ article_id │ title │ category │ distance │├────────────┼───────────────────────┼──────────┼──────────┤│ 2 │ Password Reset Help │ security │ 0.122 ││ 4 │ API Documentation │ developer│ 0.548 ││ 5 │ Account Settings │ onboarding│ 0.612 │└────────────┴───────────────────────┴──────────┴──────────┘Distance Operators
<->L2 distance (Euclidean)<=>Cosine distance<#>Inner product
SELECT title, category, embedding <=> '[0.1, 0.7, 0.3, 0.5]' as cosine_distanceFROM knowledge_articlesWHERE category = 'onboarding'ORDER BY cosine_distanceLIMIT 2;DROP TABLE knowledge_articles;3. Materialized Views
Create Base Table
CREATE TABLE sales ( sale_id INT PRIMARY KEY, product_id INT, category TEXT STORAGE DICTIONARY, amount FLOAT8 STORAGE COLUMNAR, sale_date INT8 STORAGE COLUMNAR);
INSERT INTO sales VALUES (1, 101, 'electronics', 299.99, 1705600000);INSERT INTO sales VALUES (2, 102, 'electronics', 149.99, 1705600100);INSERT INTO sales VALUES (3, 201, 'furniture', 599.99, 1705600200);INSERT INTO sales VALUES (4, 101, 'electronics', 299.99, 1705600300);INSERT INTO sales VALUES (5, 301, 'clothing', 49.99, 1705600400);INSERT INTO sales VALUES (6, 102, 'electronics', 149.99, 1705600500);INSERT INTO sales VALUES (7, 201, 'furniture', 599.99, 1705600600);INSERT INTO sales VALUES (8, 302, 'clothing', 79.99, 1705600700);Create Materialized View
CREATE MATERIALIZED VIEW category_summary ASSELECT category, COUNT(*) as sale_count, SUM(amount) as total_revenue, AVG(amount) as avg_saleFROM salesGROUP BY category;Query the View
SELECT * FROM category_summary ORDER BY total_revenue DESC;Output:
┌─────────────┬────────────┬───────────────┬──────────┐│ category │ sale_count │ total_revenue │ avg_sale │├─────────────┼────────────┼───────────────┼──────────┤│ furniture │ 2 │ 1199.98 │ 599.99 ││ electronics │ 4 │ 899.96 │ 224.99 ││ clothing │ 2 │ 129.98 │ 64.99 │└─────────────┴────────────┴───────────────┴──────────┘Refresh After Changes
INSERT INTO sales VALUES (9, 103, 'electronics', 999.99, 1705600800);
REFRESH MATERIALIZED VIEW category_summary;
SELECT * FROM category_summary ORDER BY total_revenue DESC;Product-Level View
CREATE MATERIALIZED VIEW product_summary ASSELECT product_id, category, COUNT(*) as times_sold, SUM(amount) as total_revenueFROM salesGROUP BY product_id, category;
SELECT * FROM product_summary ORDER BY total_revenue DESC;DROP MATERIALIZED VIEW product_summary;DROP MATERIALIZED VIEW category_summary;DROP TABLE sales;4. Bulk Loading for Performance
Create Table with Optimized Storage
CREATE TABLE bulk_events ( id INT PRIMARY KEY, event_type TEXT STORAGE DICTIONARY, region TEXT STORAGE DICTIONARY, timestamp INT8 STORAGE COLUMNAR, value FLOAT8 STORAGE COLUMNAR);Insert Data
INSERT INTO bulk_events VALUES (1, 'click', 'us-east', 1705700000, 1.5);INSERT INTO bulk_events VALUES (2, 'click', 'us-west', 1705700001, 2.3);INSERT INTO bulk_events VALUES (3, 'view', 'eu-west', 1705700002, 0.8);INSERT INTO bulk_events VALUES (4, 'click', 'us-east', 1705700003, 1.9);INSERT INTO bulk_events VALUES (5, 'purchase', 'us-east', 1705700004, 99.99);INSERT INTO bulk_events VALUES (6, 'view', 'ap-south', 1705700005, 0.5);INSERT INTO bulk_events VALUES (7, 'click', 'eu-west', 1705700006, 2.1);INSERT INTO bulk_events VALUES (8, 'view', 'us-west', 1705700007, 0.7);INSERT INTO bulk_events VALUES (9, 'purchase', 'eu-west', 1705700008, 149.99);INSERT INTO bulk_events VALUES (10, 'click', 'ap-south', 1705700009, 1.2);Note: For maximum performance (500K-700K rows/sec), use the Rust API:
// Enable bulk mode via APIdb.set_bulk_load_mode(true);// ... inserts ...db.set_bulk_load_mode(false);
// Or use direct_bulk_load for 700K-900K rows/secdb.direct_bulk_load("bulk_events", &rows, batch_size)?;Analyze Results
SELECT event_type, region, COUNT(*), SUM(value) as total_valueFROM bulk_eventsGROUP BY event_type, regionORDER BY total_value DESC;Output:
┌───────────┬──────────┬───────┬─────────────┐│ group_0 │ group_1 │ agg_0 │ agg_1 │├───────────┼──────────┼───────┼─────────────┤│ purchase │ eu-west │ 1 │ 149.99 ││ purchase │ us-east │ 1 │ 99.99 ││ click │ us-east │ 2 │ 3.4 ││ click │ us-west │ 1 │ 2.3 ││ click │ eu-west │ 1 │ 2.1 ││ click │ ap-south │ 1 │ 1.2 ││ view │ eu-west │ 1 │ 0.8 ││ view │ us-west │ 1 │ 0.7 ││ view │ ap-south │ 1 │ 0.5 │└───────────┴──────────┴───────┴─────────────┘Performance Comparison
| Method | Throughput |
|---|---|
| Standard INSERT | 30K-50K rows/sec |
| Bulk Load Mode | 500K-700K rows/sec |
| direct_bulk_load API | 700K-900K rows/sec |
DROP TABLE bulk_events;Combining Features
Real-World Example: Analytics Pipeline
CREATE TABLE raw_events ( id INT PRIMARY KEY, event_type TEXT STORAGE DICTIONARY, user_country TEXT STORAGE DICTIONARY, timestamp INT8 STORAGE COLUMNAR, value FLOAT8 STORAGE COLUMNAR, embedding VECTOR(4));
INSERT INTO raw_events VALUES (1, 'search', 'USA', 1705800000, 1.0, '[0.1, 0.2, 0.3, 0.4]');INSERT INTO raw_events VALUES (2, 'click', 'USA', 1705800001, 2.5, '[0.2, 0.3, 0.4, 0.5]');INSERT INTO raw_events VALUES (3, 'search', 'UK', 1705800002, 1.0, '[0.1, 0.25, 0.35, 0.45]');INSERT INTO raw_events VALUES (4, 'purchase', 'Germany', 1705800003, 99.99, '[0.8, 0.1, 0.2, 0.9]');INSERT INTO raw_events VALUES (5, 'click', 'France', 1705800004, 3.0, '[0.3, 0.4, 0.5, 0.6]');
CREATE INDEX idx_events_embedding ON raw_events USING hnsw (embedding);
CREATE MATERIALIZED VIEW event_summary ASSELECT event_type, user_country, COUNT(*), SUM(value), AVG(value)FROM raw_eventsGROUP BY event_type, user_country;
SELECT * FROM event_summary ORDER BY SUM(value) DESC;
SELECT id, event_type, embedding <-> '[0.15, 0.25, 0.35, 0.45]' as distanceFROM raw_eventsORDER BY distanceLIMIT 3;
EXPLAIN (VERBOSE) SELECT event_type, AVG(value) FROM raw_events GROUP BY event_type;
DROP MATERIALIZED VIEW event_summary;DROP TABLE raw_events;Quick Reference
EXPLAIN Options
| Option | Description |
|---|---|
VERBOSE | Show costs and cardinality |
ANALYZE | Execute and show timing |
FORMAT JSON | JSON output |
FORMAT YAML | YAML output |
Vector Distance Operators
| Operator | Distance Type |
|---|---|
<-> | L2 (Euclidean) |
<=> | Cosine |
<#> | Inner Product |
Materialized View Commands
CREATE MATERIALIZED VIEW name AS SELECT ...;REFRESH MATERIALIZED VIEW name;DROP MATERIALIZED VIEW name;Bulk Loading (Rust API)
db.set_bulk_load_mode(true);// INSERT statementsdb.set_bulk_load_mode(false);
// Or for maximum speed:db.direct_bulk_load("table", &rows, batch_size)?;See Storage Modes Tutorial for Dictionary, Content-Addressed, and Columnar storage optimization.