Getting Started with HeliosDB Nano v2.0
Version: 2.0.0 | Last Updated: November 21, 2025 | Difficulty: Beginner | Estimated Time: 30 minutes
Table of Contents
- Introduction
- Installation
- Quick Start (5-Minute Tutorial)
- Core Concepts
- Basic Operations
- Vector Search
- Time-Travel Queries
- Database Branching
- Materialized Views
- Next Steps
Introduction
What is HeliosDB Nano?
HeliosDB Nano is a PostgreSQL-compatible embedded database that brings enterprise-grade features to your applications without the complexity of a traditional database server. Think of it as SQLite with superpowers.
Key Features:
- Embedded Mode: Run directly in your application process (like SQLite)
- Server Mode: Optional PostgreSQL-compatible network server
- Vector Search: Built-in HNSW index for AI/ML embeddings (OpenAI, Cohere, etc.)
- Product Quantization: 8-16x vector compression with minimal accuracy loss
- Time-Travel Queries: Query historical data at any point in time
- Database Branching: Git-like workflows for zero-downtime deployments
- Built-in Encryption: AES-256-GCM with less than 3% overhead
- Multi-Tenant Ready: Native tenant isolation with Row-Level Security
Who Should Use HeliosDB Nano?
HeliosDB Nano is perfect for:
- AI/ML Applications: Need vector search for embeddings (RAG, semantic search)
- Edge Computing: Embedded database with server capabilities
- SaaS Applications: Multi-tenant isolation without complexity
- Mobile/Desktop Apps: Rich SQL database without external dependencies
- Prototyping: PostgreSQL compatibility means easy migration to production databases
What Makes It Different?
| Feature | SQLite | PostgreSQL | DuckDB | HeliosDB Nano |
|---|---|---|---|---|
| Embedded Mode | ✅ | ❌ | ✅ | ✅ |
| Server Mode | ❌ | ✅ | ❌ | ✅ |
| Vector Search | ❌ | Extension | ❌ | ✅ Built-in |
| Product Quantization | ❌ | ❌ | ❌ | ✅ 8-16x compression |
| Time-Travel Queries | ❌ | ❌ | ❌ | ✅ AS OF TIMESTAMP |
| Database Branching | ❌ | ❌ | ❌ | ✅ Git-like |
| Built-in Encryption | ❌ | Extension | ❌ | ✅ TDE |
| PostgreSQL Compatible | ❌ | ✅ | ❌ | ✅ 95%+ |
Installation
Prerequisites
- Rust: Version 1.75 or higher
- Operating System: Linux, macOS, or Windows
- Memory: 512MB RAM minimum (2GB+ recommended for vector workloads)
- Disk Space: 100MB for installation + your data
Option 1: Install with Cargo (Recommended)
```bash
# Install from crates.io
cargo install heliosdb-nano

# Verify installation
heliosdb-nano --version
# Output: heliosdb-nano 2.0.0
```

Option 2: Build from Source
```bash
# Clone repository
git clone https://github.com/dimensigon/heliosdb-nano.git
cd heliosdb-nano

# Build release binary
cargo build --release

# Binary location
./target/release/heliosdb-nano --version
```

Option 3: Add as Library Dependency
Add to your Cargo.toml:
```toml
[dependencies]
heliosdb-nano = "2.0.0"
```

Option 4: Docker (Coming Soon)
```bash
# Pull image
docker pull heliosdb/lite:2.0.0

# Run container
docker run -p 5432:5432 heliosdb/lite:2.0.0
```

Verify Installation
```bash
# Check version
heliosdb-nano --version

# Show help
heliosdb-nano --help

# Start REPL (interactive shell)
heliosdb-nano repl --memory
```

If you see the HeliosDB Nano welcome banner, you’re ready to go!
Quick Start (5-Minute Tutorial)
Let’s create your first database and run some queries. This tutorial will take you from zero to querying data in 5 minutes.
Step 1: Start the Interactive REPL
```bash
# Start in-memory database (data lost on exit)
heliosdb-nano repl --memory
```

You’ll see:

```text
HeliosDB Nano v2.0.0
PostgreSQL-compatible embedded database with Phase 3 features

✨ New in v2.0:
  • 8-16x vector compression (Product Quantization)
  • Database branching (git-like workflows)
  • Time-travel queries (AS OF TIMESTAMP)
  • Intelligent materialized views

Type \h for help, \dS for system views, \q to quit

heliosdb>
```

Step 2: Create Your First Table
```sql
CREATE TABLE users (
    id INT PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT NOT NULL,
    age INT
);
```

Output:

```text
Query OK (0.002s)
```

Step 3: Insert Data
```sql
INSERT INTO users VALUES (1, 'Alice Johnson', 'alice@example.com', 30);
INSERT INTO users VALUES (2, 'Bob Smith', 'bob@example.com', 25);
INSERT INTO users VALUES (3, 'Charlie Brown', 'charlie@example.com', 35);
```

Output:

```text
Query OK, 1 row affected (0.001s)
Query OK, 1 row affected (0.001s)
Query OK, 1 row affected (0.001s)
```

Step 4: Query Data
```sql
SELECT * FROM users;
```

Output:

```text
+----+----------------+---------------------+-----+
| id | name           | email               | age |
+----+----------------+---------------------+-----+
| 1  | Alice Johnson  | alice@example.com   | 30  |
| 2  | Bob Smith      | bob@example.com     | 25  |
| 3  | Charlie Brown  | charlie@example.com | 35  |
+----+----------------+---------------------+-----+
(3 rows) (0.001s)
```

Step 5: Filter and Aggregate
```sql
-- Find users older than 28
SELECT name, age FROM users WHERE age > 28;

-- Count total users
SELECT COUNT(*) as total_users FROM users;

-- Average age
SELECT AVG(age) as average_age FROM users;
```

Step 6: Update and Delete
```sql
-- Update Alice's age
UPDATE users SET age = 31 WHERE name = 'Alice Johnson';

-- Delete young users
DELETE FROM users WHERE age < 28;

-- Verify changes
SELECT * FROM users;
```

Step 7: Exit the REPL

```text
\q
```

Congratulations! You’ve just created your first database, inserted data, and run queries. In the next sections, we’ll explore more advanced features.
Core Concepts
Databases and Tables
HeliosDB Nano organizes data into tables within a database. Unlike PostgreSQL, which supports multiple databases per server, HeliosDB Nano focuses on a single database per instance (similar to SQLite).
Table Structure:
```sql
CREATE TABLE table_name (
    column1 TYPE constraints,
    column2 TYPE constraints,
    ...
);
```

Supported Data Types:
- Numeric: INT, BIGINT, FLOAT, DOUBLE
- Text: TEXT, VARCHAR(n)
- Binary: BLOB, BYTEA
- JSON: JSONB (binary JSON for efficient storage)
- Vector: VECTOR(n) (for embeddings, n = dimensions)
- Temporal: TIMESTAMP, DATE, TIME
- Boolean: BOOLEAN
Indexes
Indexes speed up queries by creating optimized data structures:
Standard B-Tree Index:
```sql
CREATE INDEX idx_users_email ON users(email);
```

Vector Index (HNSW):

```sql
CREATE INDEX idx_embeddings ON documents
USING hnsw (embedding vector_cosine_ops);
```

Composite Index:

```sql
CREATE INDEX idx_user_age_name ON users(age, name);
```

Transactions
Transactions ensure data integrity with ACID guarantees:
ACID Properties:
- Atomicity: All operations succeed or all fail
- Consistency: Database moves from one valid state to another
- Isolation: Concurrent transactions don’t interfere
- Durability: Committed changes survive crashes
Transaction Isolation: HeliosDB Nano uses Snapshot Isolation (similar to PostgreSQL’s REPEATABLE READ):
- Each transaction sees a consistent snapshot
- No dirty reads or non-repeatable reads
- Write conflicts are detected and prevented
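The write-conflict rule can be illustrated with a toy model (a sketch for intuition only, not HeliosDB Nano's actual engine): a transaction may commit only if no row it wrote was committed by another transaction after its snapshot was taken.

```rust
use std::collections::HashMap;

/// Toy model of snapshot-isolation write-conflict detection.
struct Store {
    last_commit: HashMap<String, u64>, // row key -> commit timestamp of last write
    clock: u64,                        // logical commit clock
}

impl Store {
    fn new() -> Self {
        Store { last_commit: HashMap::new(), clock: 0 }
    }

    /// Try to commit `writes` made against a snapshot taken at `snapshot_ts`.
    fn try_commit(&mut self, snapshot_ts: u64, writes: &[&str]) -> Result<u64, &'static str> {
        // Conflict: some row we wrote was committed after our snapshot was taken.
        if writes.iter().any(|k| self.last_commit.get(*k).copied().unwrap_or(0) > snapshot_ts) {
            return Err("write conflict: row changed since snapshot");
        }
        self.clock += 1;
        for k in writes {
            self.last_commit.insert(k.to_string(), self.clock);
        }
        Ok(self.clock)
    }
}

fn main() {
    let mut store = Store::new();
    let snap_a = store.clock; // transaction A's snapshot
    let snap_b = store.clock; // transaction B's snapshot (concurrent with A)
    assert!(store.try_commit(snap_a, &["row1"]).is_ok());  // A commits first
    assert!(store.try_commit(snap_b, &["row1"]).is_err()); // B's write to row1 conflicts
}
```

This is the "first committer wins" behavior: transaction B, which read the same snapshot as A but committed second, is rejected rather than silently overwriting A's update.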
Storage Modes
In-Memory Mode:
```rust
let db = EmbeddedDatabase::new_in_memory()?;
```

- Data stored in RAM only
- Fastest performance
- Data lost on exit
- Perfect for testing
Persistent Mode:
```rust
let db = EmbeddedDatabase::new("./my_database")?;
```

- Data written to disk
- Survives restarts
- ACID guarantees
- Production use
Encrypted Mode:
```rust
let config = Config {
    encryption_key: Some(key),
    ..Default::default()
};
let db = EmbeddedDatabase::with_config("./my_db", config)?;
```

- AES-256-GCM encryption
- Transparent encryption/decryption
- Less than 3% overhead
- Compliance-ready
Basic Operations
Creating Tables
Simple Table:
```sql
CREATE TABLE products (
    id INT PRIMARY KEY,
    name TEXT NOT NULL,
    price FLOAT,
    stock INT DEFAULT 0
);
```

Table with Constraints:

```sql
CREATE TABLE orders (
    order_id BIGINT PRIMARY KEY,
    customer_id INT NOT NULL,
    product_id INT NOT NULL,
    quantity INT CHECK (quantity > 0),
    total_price FLOAT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (customer_id) REFERENCES users(id),
    FOREIGN KEY (product_id) REFERENCES products(id)
);
```

Viewing Table Schema:

```sql
-- In REPL
\d products

-- SQL standard
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'products';
```

Inserting Data
Single Row:
```sql
INSERT INTO products (id, name, price, stock)
VALUES (1, 'Laptop', 1299.99, 50);
```

Multiple Rows:

```sql
INSERT INTO products VALUES
    (2, 'Mouse', 29.99, 200),
    (3, 'Keyboard', 79.99, 150),
    (4, 'Monitor', 399.99, 75);
```

Insert with SELECT:

```sql
INSERT INTO high_value_products (id, name, price)
SELECT id, name, price FROM products WHERE price > 100;
```

Returning Inserted Data:

```sql
INSERT INTO products (id, name, price)
VALUES (5, 'Headphones', 199.99)
RETURNING id, name;
```

Querying Data
Basic SELECT:
```sql
SELECT * FROM products;
SELECT name, price FROM products;
```

Filtering with WHERE:

```sql
-- Comparison operators
SELECT * FROM products WHERE price > 100;
SELECT * FROM products WHERE stock <= 50;

-- Logical operators
SELECT * FROM products WHERE price > 50 AND stock > 0;
SELECT * FROM products WHERE name = 'Laptop' OR name = 'Monitor';

-- Pattern matching
SELECT * FROM products WHERE name LIKE '%top%';
SELECT * FROM products WHERE name LIKE 'Key%';

-- Range queries
SELECT * FROM products WHERE price BETWEEN 50 AND 500;
SELECT * FROM products WHERE id IN (1, 3, 5, 7);

-- NULL checks
SELECT * FROM products WHERE stock IS NULL;
SELECT * FROM products WHERE stock IS NOT NULL;
```

Sorting with ORDER BY:

```sql
-- Ascending (default)
SELECT * FROM products ORDER BY price;

-- Descending
SELECT * FROM products ORDER BY price DESC;

-- Multiple columns
SELECT * FROM products ORDER BY price DESC, name ASC;
```

Limiting Results:
```sql
-- Top 5 most expensive products
SELECT * FROM products ORDER BY price DESC LIMIT 5;

-- Skip first 10, get next 20
SELECT * FROM products LIMIT 20 OFFSET 10;
```
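The OFFSET arithmetic for page-based pagination is an easy place for off-by-one mistakes. A small helper (illustrative only, not part of the HeliosDB API) makes the convention explicit for 1-based page numbers:

```rust
/// Compute the OFFSET for a 1-based page number, matching
/// `LIMIT page_size OFFSET offset` pagination.
fn page_offset(page: u64, page_size: u64) -> u64 {
    page.saturating_sub(1) * page_size
}

fn main() {
    assert_eq!(page_offset(1, 10), 0);  // first page starts at the beginning
    assert_eq!(page_offset(3, 10), 20); // page 3 skips the first 20 rows
}
```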
```sql
-- Pagination (page 3, 10 items per page)
SELECT * FROM products LIMIT 10 OFFSET 20;
```

Distinct Values:

```sql
-- Unique categories
SELECT DISTINCT category FROM products;

-- Count unique customers
SELECT COUNT(DISTINCT customer_id) FROM orders;
```

Updating Data
Update Single Column:
```sql
UPDATE products SET price = 1199.99 WHERE id = 1;
```

Update Multiple Columns:

```sql
UPDATE products
SET price = price * 0.9,
    stock = stock - 10
WHERE category = 'electronics';
```

Conditional Updates:

```sql
-- Clamp negative stock values to zero
UPDATE products SET stock = 0 WHERE stock < 0;

-- Update with calculation
UPDATE products SET price = price * 1.1 WHERE last_updated < '2025-01-01';
```

Deleting Data
Delete Specific Rows:
```sql
DELETE FROM products WHERE id = 5;
DELETE FROM products WHERE stock = 0 AND discontinued = true;
```

Delete All Rows (careful!):

```sql
DELETE FROM products; -- Keeps table structure
```

Drop Table:

```sql
DROP TABLE products;           -- Removes table entirely
DROP TABLE IF EXISTS products; -- No error if it doesn't exist
```

Aggregations
Basic Aggregates:
```sql
SELECT COUNT(*) FROM products;
SELECT SUM(price) FROM products;
SELECT AVG(price) FROM products;
SELECT MIN(price) FROM products;
SELECT MAX(price) FROM products;
```

GROUP BY:

```sql
-- Products per category
SELECT category, COUNT(*) as product_count
FROM products
GROUP BY category;

-- Revenue per customer
SELECT customer_id, SUM(total_price) as total_revenue
FROM orders
GROUP BY customer_id;

-- Multiple aggregates
SELECT category,
       COUNT(*) as count,
       AVG(price) as avg_price,
       MIN(price) as min_price,
       MAX(price) as max_price
FROM products
GROUP BY category;
```

HAVING Clause:

```sql
-- Categories with more than 10 products
SELECT category, COUNT(*) as count
FROM products
GROUP BY category
HAVING COUNT(*) > 10;

-- High-value customers
SELECT customer_id, SUM(total_price) as revenue
FROM orders
GROUP BY customer_id
HAVING SUM(total_price) > 10000
ORDER BY revenue DESC;
```

Joins
INNER JOIN:
```sql
-- Orders with customer names
SELECT orders.order_id,
       users.name as customer_name,
       orders.total_price
FROM orders
INNER JOIN users ON orders.customer_id = users.id;
```

LEFT JOIN:

```sql
-- All customers, with or without orders
SELECT users.name, COUNT(orders.order_id) as order_count
FROM users
LEFT JOIN orders ON users.id = orders.customer_id
GROUP BY users.id, users.name;
```

Multiple Joins:

```sql
-- Full order details
SELECT orders.order_id,
       users.name as customer,
       products.name as product,
       orders.quantity,
       orders.total_price
FROM orders
INNER JOIN users ON orders.customer_id = users.id
INNER JOIN products ON orders.product_id = products.id;
```

Transactions
Automatic Transaction:
```rust
// Embedded API - auto-commit
db.execute("INSERT INTO users VALUES (1, 'Alice')")?;
```

Explicit Transaction:

```rust
// Begin transaction
let tx = db.begin_transaction()?;

// Multiple operations
tx.execute("INSERT INTO accounts VALUES (1, 1000)")?;
tx.execute("INSERT INTO accounts VALUES (2, 500)")?;
tx.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")?;
tx.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")?;

// Commit (or rollback on error)
tx.commit()?;
```

SQL Transactions:
```sql
BEGIN TRANSACTION;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- If everything looks good
COMMIT;

-- Or if there's an issue
ROLLBACK;
```

Vector Search
Vector search enables semantic similarity searches for AI/ML applications. Instead of exact text matching, you can find items based on meaning.
Use Cases
- RAG (Retrieval-Augmented Generation): Find relevant context for LLMs
- Semantic Search: “Find products similar to this description”
- Recommendation Systems: “Find users with similar preferences”
- Image Search: Find visually similar images using embeddings
- Duplicate Detection: Find near-duplicate documents
Creating a Vector Table
```sql
CREATE TABLE documents (
    id INT PRIMARY KEY,
    content TEXT,
    embedding VECTOR(384) -- 384 dimensions (common for embeddings)
);
```

Common Dimensions:
- 384: sentence-transformers/all-MiniLM-L6-v2
- 768: BERT-base
- 1536: OpenAI text-embedding-ada-002
- 512: CLIP image embeddings
Creating a Vector Index
```sql
-- Cosine distance (most common for embeddings)
CREATE INDEX idx_doc_embeddings ON documents
USING hnsw (embedding vector_cosine_ops);

-- L2 distance (Euclidean)
CREATE INDEX idx_doc_embeddings ON documents
USING hnsw (embedding vector_l2_ops);

-- Inner product
CREATE INDEX idx_doc_embeddings ON documents
USING hnsw (embedding vector_ip_ops);
```

With Product Quantization (8-16x compression):

```sql
CREATE INDEX idx_doc_embeddings ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (
    quantization = 'product',
    pq_subquantizers = 8,
    pq_centroids = 256
);
```

Inserting Vectors
From Application Code:
```rust
use heliosdb_nano::EmbeddedDatabase;

let db = EmbeddedDatabase::new_in_memory()?;

// Create table
db.execute("
    CREATE TABLE documents (
        id INT PRIMARY KEY,
        content TEXT,
        embedding VECTOR(384)
    )
")?;

// Sample embedding (in real app, from OpenAI/Cohere/etc.)
let embedding = vec![0.1, 0.2, 0.3, /* ... 381 more values */];
let embedding_str = format!("[{}]", embedding.iter()
    .map(|v| v.to_string())
    .collect::<Vec<_>>()
    .join(", "));

// Insert with embedding
db.execute(&format!(
    "INSERT INTO documents (id, content, embedding)
     VALUES (1, 'Rust tutorial', '{}')",
    embedding_str
))?;
```

SQL with Literal Vector:
```sql
INSERT INTO documents (id, content, embedding) VALUES
    (1, 'Machine learning basics', '[0.12, 0.34, 0.56, ...]'),
    (2, 'Deep learning tutorial', '[0.15, 0.38, 0.52, ...]');
```

Searching with Vectors
Find Top-K Similar Items:
```sql
-- Find 10 most similar documents (cosine distance)
SELECT id, content,
       embedding <=> '[0.1, 0.2, 0.3, ...]' AS distance
FROM documents
ORDER BY distance
LIMIT 10;
```

Distance Operators:
- `<->`: L2 distance (Euclidean)
- `<=>`: Cosine distance (most common)
- `<#>`: Inner product (dot product)
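Under the hood these operators compute standard vector math. A plain-Rust sketch of the three metrics (note: some engines, e.g. pgvector, return the *negated* inner product from their `<#>` operator; the sign convention here is an assumption):

```rust
/// L2 (Euclidean) distance, as computed by `<->`.
fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

/// Inner (dot) product, the quantity behind `<#>`.
fn inner_product(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine distance (1 - cosine similarity), as computed by `<=>`.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - inner_product(a, b) / (norm(a) * norm(b))
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.0, 1.0];
    assert!((l2_distance(&a, &b) - 2.0_f32.sqrt()).abs() < 1e-6);
    assert_eq!(inner_product(&a, &b), 0.0);
    // Orthogonal vectors have cosine distance 1; identical directions give 0.
    assert!((cosine_distance(&a, &b) - 1.0).abs() < 1e-6);
}
```

For normalized embeddings, ranking by cosine distance and by L2 distance produces the same order, which is why cosine is the default recommendation.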
With Filters:
```sql
-- Semantic search within category
SELECT id, content,
       embedding <=> $1 AS distance
FROM documents
WHERE category = 'technology'
  AND published_date > '2025-01-01'
ORDER BY distance
LIMIT 10;
```

Distance Threshold:

```sql
-- Only return very similar items
SELECT id, content,
       embedding <=> $1 AS distance
FROM documents
WHERE (embedding <=> $1) < 0.3 -- Distance threshold
ORDER BY distance;
```

Real-World Example: RAG System
```rust
use heliosdb_nano::EmbeddedDatabase;

// Initialize database
let db = EmbeddedDatabase::new("./rag_db")?;

// Create knowledge base table
db.execute("
    CREATE TABLE knowledge_base (
        id BIGINT PRIMARY KEY,
        source TEXT,
        content TEXT,
        embedding VECTOR(1536) -- OpenAI embeddings
    )
")?;

// Create index
db.execute("
    CREATE INDEX idx_kb_embeddings ON knowledge_base
    USING hnsw (embedding vector_cosine_ops)
    WITH (quantization = 'product', pq_subquantizers = 8)
")?;

// Function to get embedding from OpenAI
async fn get_embedding(text: &str) -> Vec<f32> {
    // Call OpenAI API
    // Return embedding
}

// Index a document
async fn index_document(db: &EmbeddedDatabase, doc: &str) -> Result<()> {
    let embedding = get_embedding(doc).await;
    let embedding_str = format!("[{}]", embedding.iter()
        .map(|v| v.to_string())
        .collect::<Vec<_>>()
        .join(", "));

    db.execute(&format!(
        "INSERT INTO knowledge_base (id, source, content, embedding)
         VALUES ({}, 'docs', '{}', '{}')",
        generate_id(), doc, embedding_str
    ))?;
    Ok(())
}

// Retrieve relevant context
async fn retrieve_context(db: &EmbeddedDatabase, query: &str, k: usize) -> Result<Vec<String>> {
    let query_embedding = get_embedding(query).await;
    let embedding_str = format!("[{}]", query_embedding.iter()
        .map(|v| v.to_string())
        .collect::<Vec<_>>()
        .join(", "));

    let results = db.query(&format!(
        "SELECT content FROM knowledge_base
         ORDER BY embedding <=> '{}'
         LIMIT {}",
        embedding_str, k
    ), &[])?;

    Ok(results.iter()
        .map(|row| row.get(0).unwrap().to_string())
        .collect())
}

// Usage in RAG pipeline
async fn generate_answer(query: &str) -> String {
    let db = EmbeddedDatabase::new("./rag_db").unwrap();

    // 1. Retrieve relevant context
    let context = retrieve_context(&db, query, 5).await.unwrap();

    // 2. Build prompt
    let prompt = format!(
        "Context: {}\n\nQuestion: {}\n\nAnswer:",
        context.join("\n\n"),
        query
    );

    // 3. Call LLM (GPT-4, Claude, etc.)
    call_llm(&prompt).await
}
```

Vector Performance Tips
1. Use Product Quantization for Large Datasets:
```sql
-- 100K+ vectors: Use PQ to save memory
CREATE INDEX idx WITH (quantization = 'product', pq_subquantizers = 8);
```

2. Batch Insert Vectors:

```rust
// Better: Batch insert
let mut values = Vec::new();
for (id, content, embedding) in documents {
    values.push(format!("({}, '{}', '{}')", id, content, embedding_str));
}
db.execute(&format!("INSERT INTO documents VALUES {}", values.join(", ")))?;

// Worse: Individual inserts
for doc in documents {
    db.execute(&format!("INSERT INTO documents VALUES ..."))?;
}
```

3. Choose Right Distance Metric:
- Cosine: Normalized embeddings (OpenAI, most models) - USE THIS if unsure
- L2: Absolute distances matter
- Inner Product: Pre-normalized vectors, maximum similarity
4. Tune HNSW Parameters:
```sql
CREATE INDEX idx ON documents USING hnsw (embedding vector_cosine_ops)
WITH (
    m = 16,                -- Graph connectivity (higher = better recall, more memory)
    ef_construction = 200  -- Build quality (higher = better quality, slower build)
);
```

Time-Travel Queries
Time-travel queries let you view data as it existed at any point in time. This is powerful for auditing, debugging, and recovering from mistakes.
How It Works
HeliosDB Nano uses Multi-Version Concurrency Control (MVCC) to keep historical versions of data. Every change creates a new version without overwriting the old one.
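The lookup itself is easy to picture: scan a row's version chain and take the newest version committed at or before the requested time. A toy sketch of that idea (not the actual storage format):

```rust
/// One historical version of a row's value.
struct Version {
    commit_ts: u64,
    value: i64,
}

/// `AS OF` semantics: the newest version committed at or before `ts`.
fn as_of(versions: &[Version], ts: u64) -> Option<i64> {
    versions
        .iter()
        .filter(|v| v.commit_ts <= ts)
        .max_by_key(|v| v.commit_ts)
        .map(|v| v.value)
}

fn main() {
    let price_history = vec![
        Version { commit_ts: 10, value: 100 }, // original price
        Version { commit_ts: 20, value: 120 }, // later update
    ];
    assert_eq!(as_of(&price_history, 15), Some(100)); // historical read
    assert_eq!(as_of(&price_history, 25), Some(120)); // latest version
    assert_eq!(as_of(&price_history, 5), None);       // row did not exist yet
}
```

Because old versions are never overwritten, a historical read needs no locks; it simply filters the version chain by timestamp.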
Storage Retention: By default, all versions are kept. Configure retention policy:
```rust
let config = Config {
    retention_days: Some(30), // Keep 30 days of history
    ..Default::default()
};
```

AS OF TIMESTAMP
Query data as it existed at a specific timestamp:
```sql
-- View orders table as of yesterday
SELECT * FROM orders
AS OF TIMESTAMP '2025-11-20 00:00:00';

-- Count orders from last week
SELECT COUNT(*) FROM orders
AS OF TIMESTAMP '2025-11-14 00:00:00';

-- Compare current vs historical
SELECT current.product_id,
       current.price as current_price,
       historical.price as yesterday_price,
       current.price - historical.price as price_change
FROM products current
JOIN products AS OF TIMESTAMP '2025-11-20 00:00:00' historical
  ON current.product_id = historical.product_id
WHERE current.price != historical.price;
```

AS OF TRANSACTION
Query data as of a specific transaction ID:
```sql
-- View data at transaction 12345
SELECT * FROM orders AS OF TRANSACTION 12345;

-- Get transaction ID
SELECT pg_current_xact_id(); -- Returns current transaction ID
```

AS OF SCN
Query data as of a System Change Number:
```sql
-- View data at SCN 987654321
SELECT * FROM orders AS OF SCN 987654321;
```

Real-World Use Cases
1. Audit Trail:
```sql
-- Show what customer saw when they complained
SELECT * FROM product_catalog
AS OF TIMESTAMP '2025-11-15 14:23:00'
WHERE product_id = 'WIDGET-123';

-- Verify price at time of purchase
SELECT price FROM products
AS OF TIMESTAMP (SELECT created_at FROM orders WHERE order_id = 5678)
WHERE product_id = 42;
```

2. Data Recovery:

```sql
-- Find accidentally deleted rows
SELECT * FROM customers
AS OF TIMESTAMP '2025-11-20 23:00:00' -- Before deletion
WHERE customer_id NOT IN (SELECT customer_id FROM customers);

-- Restore deleted data
INSERT INTO customers
SELECT * FROM customers AS OF TIMESTAMP '2025-11-20 23:00:00'
WHERE customer_id = 999;
```

3. Debugging:

```sql
-- What changed between two points in time?
SELECT before.order_id,
       before.status as old_status,
       after.status as new_status,
       before.total as old_total,
       after.total as new_total
FROM orders AS OF TIMESTAMP '2025-11-20 00:00:00' before
FULL OUTER JOIN orders AS OF TIMESTAMP '2025-11-21 00:00:00' after
  USING (order_id)
WHERE before.status != after.status
   OR before.total != after.total;
```

4. Trend Analysis:

```sql
-- Daily inventory levels over past week
SELECT '2025-11-14' as date, SUM(stock) FROM products AS OF TIMESTAMP '2025-11-14 23:59:59'
UNION ALL
SELECT '2025-11-15', SUM(stock) FROM products AS OF TIMESTAMP '2025-11-15 23:59:59'
UNION ALL
SELECT '2025-11-16', SUM(stock) FROM products AS OF TIMESTAMP '2025-11-16 23:59:59'
-- ... etc
```

5. Compliance and Reporting:

```sql
-- End-of-quarter snapshot for financial reporting
SELECT customer_id,
       SUM(total_price) as quarterly_revenue
FROM orders
AS OF TIMESTAMP '2025-09-30 23:59:59'
WHERE created_at BETWEEN '2025-07-01' AND '2025-09-30'
GROUP BY customer_id;
```

Performance Considerations
1. Recent Data is Faster:
- Queries on recent timestamps are faster (fewer versions to traverse)
- Very old timestamps may be slower
2. Index Usage:
```sql
-- Indexes work with time-travel queries
CREATE INDEX idx_orders_date ON orders(created_at);

SELECT * FROM orders
AS OF TIMESTAMP '2025-11-20 00:00:00'
WHERE created_at > '2025-11-01'; -- Uses index
```

3. Storage Management:

```rust
// Configure retention to manage storage
let config = Config {
    retention_days: Some(90),   // Keep 90 days
    compact_interval_hours: 24, // Compact daily
    ..Default::default()
};
```

Database Branching
Database branching brings git-like workflows to your database. Create isolated copies, make changes, test thoroughly, then merge or discard.
Why Use Branching?
Traditional Problem:
- Testing on production is risky
- Staging databases drift from production
- Rolling back is hard
- Zero-downtime deployments are complex
Branching Solution:
- Create instant branch from production
- Test changes in isolation
- Merge back when ready
- Or discard if tests fail
Creating Branches
Basic Branch:
```sql
-- Create branch from current state
CREATE DATABASE BRANCH staging FROM CURRENT AS OF NOW;
```

Branch from Specific Time:

```sql
-- Create branch from yesterday
CREATE DATABASE BRANCH test FROM main
AS OF TIMESTAMP '2025-11-20 00:00:00';
```

Branch from Transaction:

```sql
-- Create branch from specific transaction
CREATE DATABASE BRANCH backup FROM CURRENT
AS OF TRANSACTION 987654;
```

With Options:

```sql
CREATE DATABASE BRANCH prod_copy FROM main AS OF NOW
WITH (
    replication_factor = 3,
    region = 'us-west-2'
);
```

Working with Branches
Switch Between Branches:
```rust
// In application code
let db = EmbeddedDatabase::new_branch("./mydb", "staging")?;
```

List Branches:

```sql
-- View all branches
SELECT * FROM pg_database_branches();

-- Output:
-- branch_name | parent_branch | created_at          | size_bytes
-- main        | NULL          | 2025-11-01 00:00:00 | 157286400
-- staging     | main          | 2025-11-20 10:00:00 | 157286400
-- dev         | main          | 2025-11-20 11:00:00 | 157286400
```

Branch Metadata:

```sql
-- Check branch details
SELECT branch_name,
       ROUND(size_mb::FLOAT / 1024, 2) as size_gb,
       created_at,
       status
FROM pg_database_branches()
WHERE status = 'active';
```

Merging Branches
Simple Merge:
```sql
-- Merge staging into main
MERGE DATABASE BRANCH staging INTO main;
```

Merge with Conflict Resolution:

```sql
MERGE DATABASE BRANCH staging INTO main
WITH (
    conflict_resolution = 'branch_wins', -- Source wins conflicts
    delete_branch_after = true           -- Auto-cleanup
);
```

Conflict Resolution Options:
- `branch_wins`: Source branch changes override target
- `target_wins`: Target branch changes override source
- `fail`: Fail on any conflict (manual resolution required)
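The three modes reduce to a simple per-conflict choice. A hypothetical sketch of the semantics (the variant names mirror the options above; the merge engine's internals are not documented here):

```rust
/// Hypothetical model of the three conflict-resolution modes.
enum ConflictResolution {
    BranchWins, // 'branch_wins'
    TargetWins, // 'target_wins'
    Fail,       // 'fail'
}

/// Resolve one conflicting row: `source` is the branch's value,
/// `target` is the merge target's value.
fn resolve(mode: &ConflictResolution, source: i64, target: i64) -> Result<i64, &'static str> {
    match mode {
        ConflictResolution::BranchWins => Ok(source),
        ConflictResolution::TargetWins => Ok(target),
        ConflictResolution::Fail => Err("conflict: manual resolution required"),
    }
}

fn main() {
    assert_eq!(resolve(&ConflictResolution::BranchWins, 10, 20), Ok(10));
    assert_eq!(resolve(&ConflictResolution::TargetWins, 10, 20), Ok(20));
    assert!(resolve(&ConflictResolution::Fail, 10, 20).is_err());
}
```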
Deleting Branches
```sql
-- Delete branch
DROP DATABASE BRANCH staging;

-- Delete if exists (no error)
DROP DATABASE BRANCH IF EXISTS staging;
```

Real-World Workflows
1. Development Workflow:
```sql
-- 1. Create dev branch
CREATE DATABASE BRANCH dev FROM main AS OF NOW;

-- 2. Developer works in dev branch
-- (make changes, test, iterate)

-- 3. Run tests
-- (application-level testing)

-- 4. Merge to staging for QA
CREATE DATABASE BRANCH staging FROM dev AS OF NOW;

-- 5. QA approves, merge to main
MERGE DATABASE BRANCH staging INTO main
WITH (conflict_resolution = 'branch_wins', delete_branch_after = true);

-- 6. Cleanup dev branch
DROP DATABASE BRANCH dev;
```

2. Blue-Green Deployment:
```sql
-- 1. Production runs on 'blue' branch
-- (current production environment)

-- 2. Create 'green' branch for new version
CREATE DATABASE BRANCH green FROM blue AS OF NOW;

-- 3. Deploy new application version to green
-- (application connects to green branch)

-- 4. Run tests on green
-- (smoke tests, integration tests)

-- 5. Switch traffic to green (at application layer)
-- (update connection string to use green)

-- 6. Monitor for issues
-- (if issues found, switch back to blue instantly)

-- 7. If successful, make green the new blue
MERGE DATABASE BRANCH green INTO blue;
DROP DATABASE BRANCH green;
CREATE DATABASE BRANCH green FROM blue AS OF NOW; -- Prepare for next deployment
```

3. Testing Schema Changes:
```sql
-- 1. Create test branch
CREATE DATABASE BRANCH schema_test FROM main AS OF NOW;

-- 2. Apply schema changes in test branch
ALTER TABLE users ADD COLUMN preferences JSONB;
CREATE INDEX idx_user_prefs ON users(preferences);

-- 3. Test queries
SELECT * FROM users WHERE preferences->>'theme' = 'dark';

-- 4. If tests pass, apply to main
MERGE DATABASE BRANCH schema_test INTO main;

-- 5. Cleanup
DROP DATABASE BRANCH schema_test;
```

4. A/B Testing:
```sql
-- 1. Create two branches for A/B test
CREATE DATABASE BRANCH variant_a FROM main AS OF NOW;
CREATE DATABASE BRANCH variant_b FROM main AS OF NOW;

-- 2. Apply different changes to each
-- In variant_a:
UPDATE product_recommendations SET algorithm = 'collaborative_filtering';

-- In variant_b:
UPDATE product_recommendations SET algorithm = 'content_based';

-- 3. Run A/B test (application routes users to branches)
-- (collect metrics, analyze results)

-- 4. Merge winning variant
MERGE DATABASE BRANCH variant_a INTO main;

-- 5. Cleanup
DROP DATABASE BRANCH variant_a;
DROP DATABASE BRANCH variant_b;
```

5. Disaster Recovery:
```sql
-- Regular backups as branches
CREATE DATABASE BRANCH backup_daily FROM main AS OF NOW;

-- If disaster strikes:
-- 1. Create new main from backup
CREATE DATABASE BRANCH main_new FROM backup_daily AS OF NOW;

-- 2. Verify data integrity
SELECT COUNT(*) FROM users;

-- 3. Replace main
DROP DATABASE BRANCH main;
ALTER DATABASE BRANCH main_new RENAME TO main;
```

Branch Management Best Practices
1. Naming Convention:
```text
<purpose>-<timestamp>
<purpose>-<version>
<team>-<feature>
```

Examples:
- dev-20251121
- staging-v2.1
- data-team-analytics
- backup-daily

2. Lifecycle:

```text
Create → Work → Test → Merge → Delete
```

3. Monitor Size:

```sql
-- Find large branches
SELECT branch_name, size_mb
FROM pg_database_branches()
WHERE size_mb > 1000
ORDER BY size_mb DESC;
```

4. Automated Cleanup:

```sql
-- Drop old test branches (pseudocode for scheduled job)
DO $$
DECLARE
    branch_rec RECORD;
BEGIN
    FOR branch_rec IN
        SELECT branch_name FROM pg_database_branches()
        WHERE branch_name LIKE 'test-%'
          AND created_at < NOW() - INTERVAL '7 days'
    LOOP
        EXECUTE 'DROP DATABASE BRANCH ' || branch_rec.branch_name;
    END LOOP;
END $$;
```

Materialized Views
Materialized Views (MVs) are precomputed query results stored as tables. They speed up complex queries at the cost of some staleness.
Why Use Materialized Views?
Problem: Complex aggregate queries are slow
```sql
-- This query is slow on millions of rows
SELECT customer_id,
       COUNT(*) as order_count,
       SUM(total) as revenue,
       AVG(total) as avg_order_value
FROM orders
GROUP BY customer_id;
```

Solution: Precompute and cache the results

```sql
CREATE MATERIALIZED VIEW customer_stats AS
SELECT customer_id,
       COUNT(*) as order_count,
       SUM(total) as revenue,
       AVG(total) as avg_order_value
FROM orders
GROUP BY customer_id;

-- Now this is instant
SELECT * FROM customer_stats WHERE customer_id = 123;
```

Creating Materialized Views
Basic MV:
```sql
CREATE MATERIALIZED VIEW daily_sales AS
SELECT DATE(created_at) as date,
       COUNT(*) as orders,
       SUM(total) as revenue
FROM orders
GROUP BY DATE(created_at);
```

MV with Indexes:

```sql
-- Create MV
CREATE MATERIALIZED VIEW user_stats AS
SELECT user_id,
       COUNT(*) as order_count,
       SUM(total) as revenue
FROM orders
GROUP BY user_id;

-- Add index for fast lookups
CREATE INDEX idx_user_stats_revenue ON user_stats(revenue DESC);
```

Auto-Refresh MV:

```sql
CREATE MATERIALIZED VIEW product_popularity AS
SELECT product_id,
       COUNT(*) as view_count,
       COUNT(DISTINCT user_id) as unique_viewers
FROM product_views
GROUP BY product_id
WITH (
    auto_refresh = true,           -- Enable automatic refresh
    max_cpu_percent = 15,          -- Limit CPU usage to 15%
    threshold_dml_rate = 100,      -- Refresh when 100+ DMLs/sec
    lazy_update = true,            -- Defer during high load
    lazy_catchup_window = '1 hour' -- Max staleness allowed
);
```

Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| auto_refresh | boolean | false | Enable background auto-refresh |
| max_cpu_percent | float | 15.0 | Maximum CPU usage (0-100) |
| threshold_table_size | string | '0' | Min base table size to trigger refresh |
| threshold_dml_rate | int | 0 | DMLs/sec to trigger refresh |
| lazy_update | boolean | false | Defer refresh during high load |
| lazy_catchup_window | string | '1h' | Maximum staleness allowed |
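How these options interact is easiest to see as a decision function. The sketch below is a hypothetical model using the option names above; the real scheduler's logic may differ:

```rust
/// Hypothetical model of the auto-refresh decision for one MV.
struct MvRefreshPolicy {
    threshold_dml_rate: u32,      // threshold_dml_rate
    max_cpu_percent: f32,         // max_cpu_percent
    lazy_catchup_window_secs: u64, // lazy_catchup_window, in seconds
}

impl MvRefreshPolicy {
    fn should_refresh(&self, dml_rate: u32, cpu_percent: f32, staleness_secs: u64) -> bool {
        // Always catch up once staleness exceeds the lazy window...
        let too_stale = staleness_secs > self.lazy_catchup_window_secs;
        // ...otherwise refresh only when writes are heavy and CPU headroom allows.
        let heavy_writes =
            dml_rate >= self.threshold_dml_rate && cpu_percent < self.max_cpu_percent;
        too_stale || heavy_writes
    }
}

fn main() {
    let policy = MvRefreshPolicy {
        threshold_dml_rate: 100,
        max_cpu_percent: 15.0,
        lazy_catchup_window_secs: 3600,
    };
    assert!(policy.should_refresh(150, 10.0, 60));  // busy, CPU available
    assert!(!policy.should_refresh(150, 50.0, 60)); // busy, but CPU saturated
    assert!(policy.should_refresh(0, 50.0, 7200));  // past the catch-up window
}
```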
Refreshing Materialized Views
Manual Refresh:
```sql
-- Blocking refresh (locks the MV)
REFRESH MATERIALIZED VIEW user_stats;

-- Concurrent refresh (non-blocking, users can still query)
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;
```

Automatic Refresh:

```sql
-- HeliosDB Nano automatically refreshes based on:
-- 1. DML rate (inserts/updates/deletes per second)
-- 2. CPU availability (respects max_cpu_percent)
-- 3. Staleness (lazy_catchup_window)

-- Check auto-refresh status
SELECT * FROM pg_mv_staleness();
```

Monitoring Materialized Views
Check Staleness:
```sql
SELECT view_name,
       staleness_sec / 60 as staleness_minutes,
       pending_changes,
       status
FROM pg_mv_staleness()
ORDER BY staleness_sec DESC;
```

Output:

```text
view_name     | staleness_minutes | pending_changes | status
--------------+-------------------+-----------------+--------
user_stats    | 12.5              | 1250            | stale
product_stats | 0.5               | 50              | fresh
daily_sales   | 0.0               | 0               | fresh
```

Check CPU Usage:

```sql
SELECT view_name,
       max_cpu_percent,
       current_cpu_percent,
       status
FROM pg_mv_cpu_usage();
```

Dropping Materialized Views
```sql
-- Drop MV
DROP MATERIALIZED VIEW user_stats;

-- Drop if exists
DROP MATERIALIZED VIEW IF EXISTS user_stats;
```

Real-World Use Cases
1. Analytics Dashboard:
```sql
-- Instead of slow live queries
CREATE MATERIALIZED VIEW dashboard_metrics AS
SELECT DATE(created_at) as date,
       COUNT(*) as total_orders,
       SUM(total) as revenue,
       COUNT(DISTINCT customer_id) as unique_customers,
       AVG(total) as avg_order_value
FROM orders
GROUP BY DATE(created_at)
WITH (
    auto_refresh = true,
    max_cpu_percent = 10,
    threshold_dml_rate = 50
);

-- Dashboard queries are now instant
SELECT * FROM dashboard_metrics
WHERE date >= CURRENT_DATE - INTERVAL '30 days'
ORDER BY date DESC;
```

2. Leaderboard:
```sql
CREATE MATERIALIZED VIEW user_leaderboard AS
SELECT user_id,
       users.username,
       COUNT(*) as games_played,
       SUM(score) as total_score,
       AVG(score) as avg_score,
       MAX(score) as high_score,
       RANK() OVER (ORDER BY SUM(score) DESC) as rank
FROM game_results
JOIN users ON game_results.user_id = users.id
GROUP BY user_id, users.username
WITH (
    auto_refresh = true,
    max_cpu_percent = 20,
    lazy_catchup_window = '5 minutes'
);

-- Leaderboard API is fast
SELECT * FROM user_leaderboard ORDER BY rank LIMIT 100;
```

3. E-commerce Product Stats:
CREATE MATERIALIZED VIEW product_stats ASSELECT products.product_id, products.name, COUNT(DISTINCT orders.customer_id) as buyers, COUNT(orders.order_id) as times_purchased, SUM(orders.quantity) as units_sold, SUM(orders.total) as revenue, AVG(reviews.rating) as avg_rating, COUNT(reviews.review_id) as review_countFROM productsLEFT JOIN orders ON products.product_id = orders.product_idLEFT JOIN reviews ON products.product_id = reviews.product_idGROUP BY products.product_id, products.nameWITH (auto_refresh = true, max_cpu_percent = 15);
```sql
-- Product page loads instantly
SELECT * FROM product_stats WHERE product_id = 123;
```

4. Real-Time Inventory:
```sql
CREATE MATERIALIZED VIEW inventory_status AS
SELECT
    warehouse_id,
    product_id,
    SUM(quantity) as on_hand,
    SUM(CASE WHEN status = 'reserved' THEN quantity ELSE 0 END) as reserved,
    SUM(quantity) - SUM(CASE WHEN status = 'reserved' THEN quantity ELSE 0 END) as available
FROM inventory
GROUP BY warehouse_id, product_id
WITH (
    auto_refresh = true,
    max_cpu_percent = 25,
    threshold_dml_rate = 200,           -- High rate for inventory updates
    lazy_catchup_window = '30 seconds'  -- Near real-time
);
```

Performance Tuning
1. Start Conservative:
```sql
-- Begin with safe settings
WITH (
    max_cpu_percent = 10,
    lazy_update = true,
    lazy_catchup_window = '1 hour'
)

-- Gradually increase based on monitoring
```

2. Index the MV:
```sql
-- Create indexes for common queries
CREATE INDEX idx_user_stats_revenue ON user_stats(revenue DESC);
CREATE INDEX idx_daily_sales_date ON daily_sales(date DESC);
```

3. Partition Large MVs:
```sql
-- Instead of one huge MV, create one per year
-- (use EXTRACT for PostgreSQL-compatible syntax)
CREATE MATERIALIZED VIEW sales_2025 AS
SELECT * FROM orders WHERE EXTRACT(YEAR FROM created_at) = 2025;

CREATE MATERIALIZED VIEW sales_2024 AS
SELECT * FROM orders WHERE EXTRACT(YEAR FROM created_at) = 2024;
```

4. Choose the Right Refresh Strategy:
```sql
-- Low-traffic: Aggressive refresh
WITH (max_cpu_percent = 30, threshold_dml_rate = 10)

-- High-traffic: Conservative refresh
WITH (max_cpu_percent = 10, lazy_update = true)

-- Critical: Manual refresh during a maintenance window
WITH (auto_refresh = false)
-- Then: REFRESH MATERIALIZED VIEW CONCURRENTLY during off-peak hours
```

Next Steps
Congratulations! You’ve learned the fundamentals of HeliosDB Nano. Here’s what to explore next:
Advanced Features
1. Multi-Tenancy:
```sql
-- Row-Level Security for SaaS applications
CREATE TABLE documents (
    id BIGINT PRIMARY KEY,
    tenant_id INT NOT NULL,
    content TEXT
);

CREATE POLICY tenant_isolation ON documents
    USING (tenant_id = current_tenant_id());
```

2. Full-Text Search:
```sql
-- Coming in v2.1
CREATE INDEX idx_fts ON documents USING gin(to_tsvector('english', content));
SELECT * FROM documents WHERE to_tsvector(content) @@ to_tsquery('database & vector');
```

3. JSONB Operations:
```sql
-- Store and query JSON efficiently
CREATE TABLE users (
    id INT PRIMARY KEY,
    metadata JSONB
);

INSERT INTO users VALUES (1, '{"name": "Alice", "age": 30, "tags": ["vip", "premium"]}');

-- Cast the extracted text so the comparison is numeric, not lexicographic
SELECT * FROM users WHERE (metadata->>'age')::int > 28;
SELECT * FROM users WHERE metadata->'tags' ? 'vip';
```

4. Encryption:
```rust
use heliosdb_nano::{EmbeddedDatabase, Config};
use heliosdb_nano::crypto::EncryptionKey;

// Generate encryption key
let key = EncryptionKey::generate()?;

// Create encrypted database
let config = Config {
    encryption_key: Some(key),
    ..Default::default()
};
let db = EmbeddedDatabase::with_config("./secure_db", config)?;

// All data is transparently encrypted
```

Learning Resources
Documentation:
- SQL Reference - Complete SQL syntax
- Vector Search Guide - Deep dive into embeddings
- Phase 3 User Guide - Advanced v2.0 features
- System Catalog - System views reference
- API Documentation - Rust API reference
Examples:
```bash
# Browse example code
ls examples/
# - quickstart.rs (basics)
# - vector_search_demo.rs (semantic search)
# - encryption_demo.rs (TDE)
# - concurrent_mv_refresh_demo.rs (materialized views)
```

Configuration:
Integration Guides
ORM Support:
```bash
# Read ORM integration guide
cat docs/ORM_SUPPORT.md
```

Server Mode:
```bash
# Start PostgreSQL-compatible server
heliosdb-nano server --port 5432 --data ./mydb

# Connect with any PostgreSQL client
psql postgresql://localhost:5432/mydb
```

Migration:
```bash
# Migrate from PostgreSQL/SQLite
cat docs/MIGRATION.md
```

Community and Support
Get Help:
- GitHub Issues: https://github.com/dimensigon/HeliosDB-Nano/issues
- Documentation: https://docs.heliosdb.com/lite
- Community Forum: [Coming Soon]
Contributing:
- Read CONTRIBUTING.md
- Check open issues
- Submit pull requests
Stay Updated:
- Watch the repository
- Follow release notes
- Join mailing list [Coming Soon]
Project Ideas
1. Build a RAG System:
- Index documentation with embeddings
- Semantic search for relevant context
- Feed to LLM for accurate answers
2. E-commerce Backend:
- Multi-tenant product catalog
- Vector search for product recommendations
- Materialized views for analytics
3. Git-like Version Control:
- Use database branching for workflows
- Time-travel for audit trails
- Blue-green deployments
4. Real-Time Dashboard:
- Materialized views for aggregates
- Auto-refresh for near real-time data
- Vector clustering for insights
5. Mobile App Backend:
- Embedded mode for offline-first
- Server mode for sync
- Encryption for security
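The retrieval step at the heart of project idea 1 is exactly the ordering that `ORDER BY embedding <=> query LIMIT k` computes inside the database. As a self-contained sketch (plain Rust with toy 3-dimensional embeddings, illustrative only — a real RAG system would store `VECTOR(n)` columns and let the HNSW index do this ranking):

```rust
// Rank stored document embeddings by cosine distance to a query
// embedding -- the same ordering `embedding <=> query` produces.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (na * nb) // 0.0 = same direction, 2.0 = opposite
}

fn top_k<'a>(docs: &'a [(&'a str, Vec<f32>)], query: &[f32], k: usize) -> Vec<&'a str> {
    let mut ranked: Vec<(&str, f32)> = docs
        .iter()
        .map(|(name, emb)| (*name, cosine_distance(emb, query)))
        .collect();
    // Ascending by distance: closest documents first.
    ranked.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    ranked.into_iter().take(k).map(|(name, _)| name).collect()
}

fn main() {
    let docs = vec![
        ("intro.md", vec![1.0, 0.0, 0.0]),
        ("vectors.md", vec![0.0, 1.0, 0.1]),
        ("branching.md", vec![0.0, 0.1, 1.0]),
    ];
    let query = [0.0, 0.9, 0.2]; // embedding of the user's question
    println!("{:?}", top_k(&docs, &query, 2)); // ["vectors.md", "branching.md"]
}
```

The `k` closest chunks are then passed to the LLM as context for grounded answers.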
Quick Reference Card
Common Commands:
```sql
-- Tables
CREATE TABLE name (col TYPE);
DROP TABLE name;
\d name  -- Describe table (REPL)

-- CRUD
INSERT INTO table VALUES (...);
SELECT * FROM table WHERE condition;
UPDATE table SET col = val WHERE condition;
DELETE FROM table WHERE condition;

-- Indexes
CREATE INDEX idx_name ON table(col);
CREATE INDEX idx_vec ON table USING hnsw (vec vector_cosine_ops);

-- Transactions
BEGIN; ... COMMIT;
BEGIN; ... ROLLBACK;

-- Vector Search
SELECT * FROM docs ORDER BY embedding <=> '[...]' LIMIT 10;

-- Time-Travel
SELECT * FROM table AS OF TIMESTAMP '2025-11-20 00:00:00';

-- Branching
CREATE DATABASE BRANCH dev FROM main AS OF NOW;
MERGE DATABASE BRANCH dev INTO main;
DROP DATABASE BRANCH dev;

-- Materialized Views
CREATE MATERIALIZED VIEW mv AS SELECT ...;
REFRESH MATERIALIZED VIEW mv;
DROP MATERIALIZED VIEW mv;

-- System Views
SELECT * FROM pg_database_branches();
SELECT * FROM pg_mv_staleness();
SELECT * FROM pg_vector_index_stats();
```

REPL Meta-Commands:
```
\h          Help
\d          List tables
\d table    Describe table
\dS         List system views
\dS view    Describe system view
\q          Quit
```

Rust API:
```rust
// Create database
let db = EmbeddedDatabase::new("./mydb")?;
let db = EmbeddedDatabase::new_in_memory()?;

// Execute SQL
db.execute("CREATE TABLE ...")?;
let count = db.execute("INSERT ...")?;

// Query
let results = db.query("SELECT * FROM users", &[])?;

// Transaction
let tx = db.begin_transaction()?;
tx.execute("INSERT ...")?;
tx.commit()?;
```

Troubleshooting
Problem: “Table already exists” error
```sql
-- Solution: Use IF NOT EXISTS
CREATE TABLE IF NOT EXISTS users (...);
```

Problem: Vector dimensions mismatch
```sql
-- Solution: Ensure all vectors have the same dimension
CREATE TABLE docs (embedding VECTOR(384));  -- All must be 384-dim
```

Problem: Slow vector search
```sql
-- Solution: Create HNSW index
CREATE INDEX idx ON docs USING hnsw (embedding vector_cosine_ops);
```

Problem: Materialized view is stale
```sql
-- Solution: Manual refresh
REFRESH MATERIALIZED VIEW mv;

-- Or check auto-refresh settings
SELECT * FROM pg_mv_staleness();
```

Problem: Out of memory with large vectors
```sql
-- Solution: Use Product Quantization
CREATE INDEX idx ON docs USING hnsw (embedding vector_cosine_ops)
WITH (quantization = 'product', pq_subquantizers = 8);
```

Glossary
- ACID: Atomicity, Consistency, Isolation, Durability (transaction guarantees)
- Embedding: Vector representation of data (text, image, etc.)
- HNSW: Hierarchical Navigable Small World (vector index algorithm)
- MVCC: Multi-Version Concurrency Control (enables time-travel)
- MV: Materialized View (cached query result)
- PQ: Product Quantization (vector compression technique)
- RAG: Retrieval-Augmented Generation (LLM + knowledge base)
- Semantic Search: Search by meaning, not keywords
- TDE: Transparent Data Encryption
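To make the PQ entry concrete, here is a toy sketch of product-quantization encoding (pure Rust, illustrative only — not HeliosDB's implementation, and the centroids are hand-picked rather than learned with k-means). Each vector is split into `m` sub-vectors, and each sub-vector is replaced by the index of its nearest codebook centroid, so a d-dimensional f32 vector (4·d bytes) shrinks to `m` one-byte codes; real-world savings also depend on codebook and index overhead.

```rust
// Toy product quantization: split a vector into m sub-vectors and
// encode each as the index of its nearest centroid (one byte each).

fn squared_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Encode `v` using one codebook of centroids per sub-vector.
fn pq_encode(v: &[f32], codebooks: &[Vec<Vec<f32>>]) -> Vec<u8> {
    let m = codebooks.len();
    let sub_len = v.len() / m;
    (0..m)
        .map(|i| {
            let sub = &v[i * sub_len..(i + 1) * sub_len];
            // Pick the index of the nearest centroid in this sub-space.
            codebooks[i]
                .iter()
                .enumerate()
                .min_by(|(_, a), (_, b)| {
                    squared_dist(sub, a)
                        .partial_cmp(&squared_dist(sub, b))
                        .unwrap()
                })
                .map(|(idx, _)| idx as u8)
                .unwrap()
        })
        .collect()
}

fn main() {
    // 4-dim vector, m = 2 sub-quantizers, 2 centroids each (toy sizes).
    let codebooks = vec![
        vec![vec![0.0, 0.0], vec![1.0, 1.0]], // centroids for dims 0..2
        vec![vec![0.0, 1.0], vec![1.0, 0.0]], // centroids for dims 2..4
    ];
    let codes = pq_encode(&[0.9, 1.1, 0.1, 0.8], &codebooks);
    println!("{:?}", codes); // [1, 0] -- 16 bytes of f32 stored as 2 bytes
}
```

At search time, distances are approximated directly from the codes and codebooks, which is how an HNSW index with `quantization = 'product'` keeps large vector sets in memory.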
Document Version: 1.0 Last Updated: November 21, 2025 Applies To: HeliosDB Nano v2.0.0 and later
For questions or feedback, please open an issue on [GitHub](https://github.com/dimensigon/HeliosDB-Nano/issues).
Happy coding with HeliosDB Nano! 🚀