Getting Started with HeliosDB-Lite v2.0
Getting Started with HeliosDB-Lite v2.0
Version: 2.0.0 Last Updated: November 21, 2025 Difficulty: Beginner Estimated Time: 30 minutes
Table of Contents
- Introduction
- Installation
- Quick Start (5-Minute Tutorial)
- Core Concepts
- Basic Operations
- Vector Search
- Time-Travel Queries
- Database Branching
- Materialized Views
- Next Steps
Introduction
What is HeliosDB-Lite?
HeliosDB-Lite is a PostgreSQL-compatible embedded database that brings enterprise-grade features to your applications without the complexity of a traditional database server. Think of it as SQLite with superpowers.
Key Features:
- Embedded Mode: Run directly in your application process (like SQLite)
- Server Mode: Optional PostgreSQL-compatible network server
- Vector Search: Built-in HNSW index for AI/ML embeddings (OpenAI, Cohere, etc.)
- Product Quantization: 8-16x vector compression with minimal accuracy loss
- Time-Travel Queries: Query historical data at any point in time
- Database Branching: Git-like workflows for zero-downtime deployments
- Built-in Encryption: AES-256-GCM with less than 3% overhead
- Multi-Tenant Ready: Native tenant isolation with Row-Level Security
Who Should Use HeliosDB-Lite?
HeliosDB-Lite is perfect for:
- AI/ML Applications: Need vector search for embeddings (RAG, semantic search)
- Edge Computing: Embedded database with server capabilities
- SaaS Applications: Multi-tenant isolation without complexity
- Mobile/Desktop Apps: Rich SQL database without external dependencies
- Prototyping: PostgreSQL compatibility means easy migration to production databases
What Makes It Different?
| Feature | SQLite | PostgreSQL | DuckDB | HeliosDB-Lite |
|---|---|---|---|---|
| Embedded Mode | ✅ | ❌ | ✅ | ✅ |
| Server Mode | ❌ | ✅ | ❌ | ✅ |
| Vector Search | ❌ | Extension | ❌ | ✅ Built-in |
| Product Quantization | ❌ | ❌ | ❌ | ✅ 8-16x compression |
| Time-Travel Queries | ❌ | ❌ | ❌ | ✅ AS OF TIMESTAMP |
| Database Branching | ❌ | ❌ | ❌ | ✅ Git-like |
| Built-in Encryption | ❌ | Extension | ❌ | ✅ TDE |
| PostgreSQL Compatible | ❌ | ✅ | ❌ | ✅ 95%+ |
Installation
Prerequisites
- Rust: Version 1.75 or higher
- Operating System: Linux, macOS, or Windows
- Memory: 512MB RAM minimum (2GB+ recommended for vector workloads)
- Disk Space: 100MB for installation + your data
Option 1: Install with Cargo (Recommended)
# Install from crates.iocargo install heliosdb-lite
# Verify installationheliosdb-lite --version# Output: heliosdb-lite 2.0.0Option 2: Build from Source
# Clone repositorygit clone https://github.com/dimensigon/HeliosDB-Lite.gitcd HeliosDB-Lite
# Build release binarycargo build --release
# Binary location./target/release/heliosdb-lite --versionOption 3: Add as Library Dependency
Add to your Cargo.toml:
[dependencies]heliosdb-lite = "2.0.0"Option 4: Docker (Coming Soon)
# Pull imagedocker pull heliosdb/lite:2.0.0
# Run containerdocker run -p 5432:5432 heliosdb/lite:2.0.0Verify Installation
# Check versionheliosdb-lite --version
# Show helpheliosdb-lite --help
# Start REPL (interactive shell)heliosdb-lite repl --memoryIf you see the HeliosDB-Lite welcome banner, you’re ready to go!
Quick Start (5-Minute Tutorial)
Let’s create your first database and run some queries. This tutorial will take you from zero to querying data in 5 minutes.
Step 1: Start the Interactive REPL
# Start in-memory database (data lost on exit)heliosdb-lite repl --memoryYou’ll see:
HeliosDB Lite v2.0.0 PostgreSQL-compatible embedded database with Phase 3 features
✨ New in v2.0: • 8-16x vector compression (Product Quantization) • Database branching (git-like workflows) • Time-travel queries (AS OF TIMESTAMP) • Intelligent materialized views
Type \h for help, \dS for system views, \q to quit
heliosdb>Step 2: Create Your First Table
CREATE TABLE users ( id INT PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL, age INT);Output:
Query OK (0.002s)Step 3: Insert Data
INSERT INTO users VALUES (1, 'Alice Johnson', 'alice@example.com', 30);INSERT INTO users VALUES (2, 'Bob Smith', 'bob@example.com', 25);INSERT INTO users VALUES (3, 'Charlie Brown', 'charlie@example.com', 35);Output:
Query OK, 1 row affected (0.001s)Query OK, 1 row affected (0.001s)Query OK, 1 row affected (0.001s)Step 4: Query Data
SELECT * FROM users;Output:
+----+----------------+---------------------+-----+| id | name | email | age |+----+----------------+---------------------+-----+| 1 | Alice Johnson | alice@example.com | 30 || 2 | Bob Smith | bob@example.com | 25 || 3 | Charlie Brown | charlie@example.com | 35 |+----+----------------+---------------------+-----+(3 rows) (0.001s)Step 5: Filter and Aggregate
-- Find users older than 28SELECT name, age FROM users WHERE age > 28;
-- Count total usersSELECT COUNT(*) as total_users FROM users;
-- Average ageSELECT AVG(age) as average_age FROM users;Step 6: Update and Delete
-- Update Alice's ageUPDATE users SET age = 31 WHERE name = 'Alice Johnson';
-- Delete young usersDELETE FROM users WHERE age < 28;
-- Verify changesSELECT * FROM users;Step 7: Exit the REPL
\qCongratulations! You’ve just created your first database, inserted data, and run queries. In the next sections, we’ll explore more advanced features.
Core Concepts
Databases and Tables
HeliosDB-Lite organizes data into tables within a database. Unlike PostgreSQL, which supports multiple databases per server, HeliosDB-Lite focuses on a single database per instance (similar to SQLite).
Table Structure:
CREATE TABLE table_name ( column1 TYPE constraints, column2 TYPE constraints, ...);Supported Data Types:
- Numeric:
INT,BIGINT,FLOAT,DOUBLE - Text:
TEXT,VARCHAR(n) - Binary:
BLOB,BYTEA - JSON:
JSONB(binary JSON for efficient storage) - Vector:
VECTOR(n)(for embeddings, n = dimensions) - Temporal:
TIMESTAMP,DATE,TIME - Boolean:
BOOLEAN
Indexes
Indexes speed up queries by creating optimized data structures:
Standard B-Tree Index:
CREATE INDEX idx_users_email ON users(email);Vector Index (HNSW):
CREATE INDEX idx_embeddings ON documentsUSING hnsw (embedding vector_cosine_ops);Composite Index:
CREATE INDEX idx_user_age_name ON users(age, name);Transactions
Transactions ensure data integrity with ACID guarantees:
ACID Properties:
- Atomicity: All operations succeed or all fail
- Consistency: Database moves from one valid state to another
- Isolation: Concurrent transactions don’t interfere
- Durability: Committed changes survive crashes
Transaction Isolation: HeliosDB-Lite uses Snapshot Isolation (similar to PostgreSQL’s REPEATABLE READ):
- Each transaction sees a consistent snapshot
- No dirty reads or non-repeatable reads
- Write conflicts are detected and prevented
Storage Modes
In-Memory Mode:
let db = EmbeddedDatabase::new_in_memory()?;- Data stored in RAM only
- Fastest performance
- Data lost on exit
- Perfect for testing
Persistent Mode:
let db = EmbeddedDatabase::new("./my_database")?;- Data written to disk
- Survives restarts
- ACID guarantees
- Production use
Encrypted Mode:
let config = Config { encryption_key: Some(key), ..Default::default()};let db = EmbeddedDatabase::with_config("./my_db", config)?;- AES-256-GCM encryption
- Transparent encryption/decryption
- Less than 3% overhead
- Compliance-ready
Basic Operations
Creating Tables
Simple Table:
CREATE TABLE products ( id INT PRIMARY KEY, name TEXT NOT NULL, price FLOAT, stock INT DEFAULT 0);Table with Constraints:
CREATE TABLE orders ( order_id BIGINT PRIMARY KEY, customer_id INT NOT NULL, product_id INT NOT NULL, quantity INT CHECK (quantity > 0), total_price FLOAT NOT NULL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, FOREIGN KEY (customer_id) REFERENCES users(id), FOREIGN KEY (product_id) REFERENCES products(id));Viewing Table Schema:
-- In REPL\d products
-- SQL standardSELECT column_name, data_typeFROM information_schema.columnsWHERE table_name = 'products';Inserting Data
Single Row:
INSERT INTO products (id, name, price, stock)VALUES (1, 'Laptop', 1299.99, 50);Multiple Rows:
INSERT INTO products VALUES (2, 'Mouse', 29.99, 200), (3, 'Keyboard', 79.99, 150), (4, 'Monitor', 399.99, 75);Insert with SELECT:
INSERT INTO high_value_products (id, name, price)SELECT id, name, price FROM products WHERE price > 100;Returning Inserted Data:
INSERT INTO products (id, name, price)VALUES (5, 'Headphones', 199.99)RETURNING id, name;Querying Data
Basic SELECT:
SELECT * FROM products;SELECT name, price FROM products;Filtering with WHERE:
-- Comparison operatorsSELECT * FROM products WHERE price > 100;SELECT * FROM products WHERE stock <= 50;
-- Logical operatorsSELECT * FROM products WHERE price > 50 AND stock > 0;SELECT * FROM products WHERE name = 'Laptop' OR name = 'Monitor';
-- Pattern matchingSELECT * FROM products WHERE name LIKE '%top%';SELECT * FROM products WHERE name LIKE 'Key%';
-- Range queriesSELECT * FROM products WHERE price BETWEEN 50 AND 500;SELECT * FROM products WHERE id IN (1, 3, 5, 7);
-- NULL checksSELECT * FROM products WHERE stock IS NULL;SELECT * FROM products WHERE stock IS NOT NULL;Sorting with ORDER BY:
-- Ascending (default)SELECT * FROM products ORDER BY price;
-- DescendingSELECT * FROM products ORDER BY price DESC;
-- Multiple columnsSELECT * FROM products ORDER BY price DESC, name ASC;Limiting Results:
-- Top 5 most expensive productsSELECT * FROM products ORDER BY price DESC LIMIT 5;
-- Skip first 10, get next 20SELECT * FROM products LIMIT 20 OFFSET 10;
-- Pagination (page 3, 10 items per page)SELECT * FROM products LIMIT 10 OFFSET 20;Distinct Values:
-- Unique categoriesSELECT DISTINCT category FROM products;
-- Count unique customersSELECT COUNT(DISTINCT customer_id) FROM orders;Updating Data
Update Single Column:
UPDATE products SET price = 1199.99 WHERE id = 1;Update Multiple Columns:
UPDATE productsSET price = price * 0.9, stock = stock - 10WHERE category = 'electronics';Conditional Updates:
-- Mark out-of-stock productsUPDATE products SET stock = 0 WHERE stock < 0;
-- Update with calculationUPDATE products SET price = price * 1.1 WHERE last_updated < '2025-01-01';Deleting Data
Delete Specific Rows:
DELETE FROM products WHERE id = 5;DELETE FROM products WHERE stock = 0 AND discontinued = true;Delete All Rows (careful!):
DELETE FROM products; -- Keeps table structureDrop Table:
DROP TABLE products; -- Removes table entirelyDROP TABLE IF EXISTS products; -- No error if doesn't existAggregations
Basic Aggregates:
SELECT COUNT(*) FROM products;SELECT SUM(price) FROM products;SELECT AVG(price) FROM products;SELECT MIN(price) FROM products;SELECT MAX(price) FROM products;GROUP BY:
-- Products per categorySELECT category, COUNT(*) as product_countFROM productsGROUP BY category;
-- Revenue per customerSELECT customer_id, SUM(total_price) as total_revenueFROM ordersGROUP BY customer_id;
-- Multiple aggregatesSELECT category, COUNT(*) as count, AVG(price) as avg_price, MIN(price) as min_price, MAX(price) as max_priceFROM productsGROUP BY category;HAVING Clause:
-- Categories with more than 10 productsSELECT category, COUNT(*) as countFROM productsGROUP BY categoryHAVING COUNT(*) > 10;
-- High-value customersSELECT customer_id, SUM(total_price) as revenueFROM ordersGROUP BY customer_idHAVING SUM(total_price) > 10000ORDER BY revenue DESC;Joins
INNER JOIN:
-- Orders with customer namesSELECT orders.order_id, users.name as customer_name, orders.total_priceFROM ordersINNER JOIN users ON orders.customer_id = users.id;LEFT JOIN:
-- All customers, with or without ordersSELECT users.name, COUNT(orders.order_id) as order_countFROM usersLEFT JOIN orders ON users.id = orders.customer_idGROUP BY users.id, users.name;Multiple Joins:
-- Full order detailsSELECT orders.order_id, users.name as customer, products.name as product, orders.quantity, orders.total_priceFROM ordersINNER JOIN users ON orders.customer_id = users.idINNER JOIN products ON orders.product_id = products.id;Transactions
Automatic Transaction:
// Embedded API - auto-commitdb.execute("INSERT INTO users VALUES (1, 'Alice')")?;Explicit Transaction:
// Begin transactionlet tx = db.begin_transaction()?;
// Multiple operationstx.execute("INSERT INTO accounts VALUES (1, 1000)")?;tx.execute("INSERT INTO accounts VALUES (2, 500)")?;tx.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")?;tx.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")?;
// Commit (or rollback on error)tx.commit()?;SQL Transactions:
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- If everything looks goodCOMMIT;
-- Or if there's an issueROLLBACK;Vector Search
Vector search enables semantic similarity searches for AI/ML applications. Instead of exact text matching, you can find items based on meaning.
Use Cases
- RAG (Retrieval-Augmented Generation): Find relevant context for LLMs
- Semantic Search: “Find products similar to this description”
- Recommendation Systems: “Find users with similar preferences”
- Image Search: Find visually similar images using embeddings
- Duplicate Detection: Find near-duplicate documents
Creating a Vector Table
CREATE TABLE documents ( id INT PRIMARY KEY, content TEXT, embedding VECTOR(384) -- 384 dimensions (common for embeddings));Common Dimensions:
- 384: sentence-transformers/all-MiniLM-L6-v2
- 768: BERT, OpenAI text-embedding-ada-002 (v1)
- 1536: OpenAI text-embedding-ada-002 (v2)
- 512: CLIP image embeddings
Creating a Vector Index
-- Cosine distance (most common for embeddings)CREATE INDEX idx_doc_embeddings ON documentsUSING hnsw (embedding vector_cosine_ops);
-- L2 distance (Euclidean)CREATE INDEX idx_doc_embeddings ON documentsUSING hnsw (embedding vector_l2_ops);
-- Inner productCREATE INDEX idx_doc_embeddings ON documentsUSING hnsw (embedding vector_ip_ops);With Product Quantization (8-16x compression):
CREATE INDEX idx_doc_embeddings ON documentsUSING hnsw (embedding vector_cosine_ops)WITH ( quantization = 'product', pq_subquantizers = 8, pq_centroids = 256);Inserting Vectors
From Application Code:
use heliosdb_lite::EmbeddedDatabase;
let db = EmbeddedDatabase::new_in_memory()?;
// Create tabledb.execute(" CREATE TABLE documents ( id INT PRIMARY KEY, content TEXT, embedding VECTOR(384) )")?;
// Sample embedding (in real app, from OpenAI/Cohere/etc.)let embedding = vec![0.1, 0.2, 0.3, /* ... 381 more values */];let embedding_str = format!("[{}]", embedding.iter() .map(|v| v.to_string()) .collect::<Vec<_>>() .join(", "));
// Insert with embeddingdb.execute(&format!( "INSERT INTO documents (id, content, embedding) VALUES (1, 'Rust tutorial', '{}')", embedding_str))?;SQL with Literal Vector:
INSERT INTO documents (id, content, embedding) VALUES (1, 'Machine learning basics', '[0.12, 0.34, 0.56, ...]'), (2, 'Deep learning tutorial', '[0.15, 0.38, 0.52, ...]');Searching with Vectors
Find Top-K Similar Items:
-- Find 10 most similar documents (cosine distance)SELECT id, content, embedding <=> '[0.1, 0.2, 0.3, ...]' AS distanceFROM documentsORDER BY distanceLIMIT 10;Distance Operators:
<->: L2 distance (Euclidean)<=>: Cosine distance (most common)<#>: Inner product (dot product)
With Filters:
-- Semantic search within categorySELECT id, content, embedding <=> $1 AS distanceFROM documentsWHERE category = 'technology' AND published_date > '2025-01-01'ORDER BY distanceLIMIT 10;Distance Threshold:
-- Only return very similar itemsSELECT id, content, embedding <=> $1 AS distanceFROM documentsWHERE (embedding <=> $1) < 0.3 -- Distance thresholdORDER BY distance;Real-World Example: RAG System
use heliosdb_lite::EmbeddedDatabase;
// Initialize databaselet db = EmbeddedDatabase::new("./rag_db")?;
// Create knowledge base tabledb.execute(" CREATE TABLE knowledge_base ( id BIGINT PRIMARY KEY, source TEXT, content TEXT, embedding VECTOR(1536) -- OpenAI embeddings )")?;
// Create indexdb.execute(" CREATE INDEX idx_kb_embeddings ON knowledge_base USING hnsw (embedding vector_cosine_ops) WITH (quantization = 'product', pq_subquantizers = 8)")?;
// Function to get embedding from OpenAIasync fn get_embedding(text: &str) -> Vec<f32> { // Call OpenAI API // Return embedding}
// Index a documentasync fn index_document(db: &EmbeddedDatabase, doc: &str) -> Result<()> { let embedding = get_embedding(doc).await; let embedding_str = format!("[{}]", embedding.iter() .map(|v| v.to_string()) .collect::<Vec<_>>() .join(", "));
db.execute(&format!( "INSERT INTO knowledge_base (id, source, content, embedding) VALUES ({}, 'docs', '{}', '{}')", generate_id(), doc, embedding_str ))?; Ok(())}
// Retrieve relevant contextasync fn retrieve_context(db: &EmbeddedDatabase, query: &str, k: usize) -> Result<Vec<String>> { let query_embedding = get_embedding(query).await; let embedding_str = format!("[{}]", query_embedding.iter() .map(|v| v.to_string()) .collect::<Vec<_>>() .join(", "));
let results = db.query(&format!( "SELECT content FROM knowledge_base ORDER BY embedding <=> '{}' LIMIT {}", embedding_str, k ), &[])?;
Ok(results.iter() .map(|row| row.get(0).unwrap().to_string()) .collect())}
// Usage in RAG pipelineasync fn generate_answer(query: &str) -> String { let db = EmbeddedDatabase::new("./rag_db").unwrap();
// 1. Retrieve relevant context let context = retrieve_context(&db, query, 5).await.unwrap();
// 2. Build prompt let prompt = format!( "Context: {}\n\nQuestion: {}\n\nAnswer:", context.join("\n\n"), query );
// 3. Call LLM (GPT-4, Claude, etc.) call_llm(&prompt).await}Vector Performance Tips
1. Use Product Quantization for Large Datasets:
-- 100K+ vectors: Use PQ to save memoryCREATE INDEX idx WITH (quantization = 'product', pq_subquantizers = 8);2. Batch Insert Vectors:
// Better: Batch insertlet mut values = Vec::new();for (id, content, embedding) in documents { values.push(format!("({}, '{}', '{}')", id, content, embedding_str));}db.execute(&format!("INSERT INTO documents VALUES {}", values.join(", ")))?;
// Worse: Individual insertsfor doc in documents { db.execute(&format!("INSERT INTO documents VALUES ..."))?;}3. Choose Right Distance Metric:
- Cosine: Normalized embeddings (OpenAI, most models) - USE THIS if unsure
- L2: Absolute distances matter
- Inner Product: Pre-normalized vectors, maximum similarity
4. Tune HNSW Parameters:
CREATE INDEX idx ON documents USING hnsw (embedding vector_cosine_ops)WITH ( m = 16, -- Graph connectivity (higher = better recall, more memory) ef_construction = 200 -- Build quality (higher = better quality, slower build));Time-Travel Queries
Time-travel queries let you view data as it existed at any point in time. This is powerful for auditing, debugging, and recovering from mistakes.
How It Works
HeliosDB-Lite uses Multi-Version Concurrency Control (MVCC) to keep historical versions of data. Every change creates a new version without overwriting the old one.
Storage Retention: By default, all versions are kept. Configure retention policy:
let config = Config { retention_days: Some(30), // Keep 30 days of history ..Default::default()};AS OF TIMESTAMP
Query data as it existed at a specific timestamp:
-- View orders table as of yesterdaySELECT * FROM ordersAS OF TIMESTAMP '2025-11-20 00:00:00';
-- Count orders from last weekSELECT COUNT(*) FROM ordersAS OF TIMESTAMP '2025-11-14 00:00:00';
-- Compare current vs historicalSELECT current.product_id, current.price as current_price, historical.price as yesterday_price, current.price - historical.price as price_changeFROM products currentJOIN products AS OF TIMESTAMP '2025-11-20 00:00:00' historical ON current.product_id = historical.product_idWHERE current.price != historical.price;AS OF TRANSACTION
Query data as of a specific transaction ID:
-- View data at transaction 12345SELECT * FROM orders AS OF TRANSACTION 12345;
-- Get transaction IDSELECT pg_current_xact_id(); -- Returns current transaction IDAS OF SCN
Query data as of a System Change Number:
-- View data at SCN 987654321SELECT * FROM orders AS OF SCN 987654321;Real-World Use Cases
1. Audit Trail:
-- Show what customer saw when they complainedSELECT * FROM product_catalogAS OF TIMESTAMP '2025-11-15 14:23:00'WHERE product_id = 'WIDGET-123';
-- Verify price at time of purchaseSELECT price FROM productsAS OF TIMESTAMP (SELECT created_at FROM orders WHERE order_id = 5678)WHERE product_id = 42;2. Data Recovery:
-- Find accidentally deleted rowsSELECT * FROM customersAS OF TIMESTAMP '2025-11-20 23:00:00' -- Before deletionWHERE customer_id NOT IN (SELECT customer_id FROM customers);
-- Restore deleted dataINSERT INTO customersSELECT * FROM customers AS OF TIMESTAMP '2025-11-20 23:00:00'WHERE customer_id = 999;3. Debugging:
-- What changed between two points in time?SELECT before.order_id, before.status as old_status, after.status as new_status, before.total as old_total, after.total as new_totalFROM orders AS OF TIMESTAMP '2025-11-20 00:00:00' beforeFULL OUTER JOIN orders AS OF TIMESTAMP '2025-11-21 00:00:00' after USING (order_id)WHERE before.status != after.status OR before.total != after.total;4. Trend Analysis:
-- Daily inventory levels over past weekSELECT '2025-11-14' as date, SUM(stock) FROM products AS OF TIMESTAMP '2025-11-14 23:59:59'UNION ALLSELECT '2025-11-15', SUM(stock) FROM products AS OF TIMESTAMP '2025-11-15 23:59:59'UNION ALLSELECT '2025-11-16', SUM(stock) FROM products AS OF TIMESTAMP '2025-11-16 23:59:59'-- ... etc5. Compliance and Reporting:
-- End-of-quarter snapshot for financial reportingSELECT customer_id, SUM(total_price) as quarterly_revenueFROM ordersAS OF TIMESTAMP '2025-09-30 23:59:59'WHERE created_at BETWEEN '2025-07-01' AND '2025-09-30'GROUP BY customer_id;Performance Considerations
1. Recent Data is Faster:
- Queries on recent timestamps are faster (fewer versions to traverse)
- Very old timestamps may be slower
2. Index Usage:
-- Indexes work with time-travel queriesCREATE INDEX idx_orders_date ON orders(created_at);
SELECT * FROM ordersAS OF TIMESTAMP '2025-11-20 00:00:00'WHERE created_at > '2025-11-01'; -- Uses index3. Storage Management:
// Configure retention to manage storagelet config = Config { retention_days: Some(90), // Keep 90 days compact_interval_hours: 24, // Compact daily ..Default::default()};Database Branching
Database branching brings git-like workflows to your database. Create isolated copies, make changes, test thoroughly, then merge or discard.
Why Use Branching?
Traditional Problem:
- Testing on production is risky
- Staging databases drift from production
- Rolling back is hard
- Zero-downtime deployments are complex
Branching Solution:
- Create instant branch from production
- Test changes in isolation
- Merge back when ready
- Or discard if tests fail
Creating Branches
Basic Branch:
-- Create branch from current stateCREATE DATABASE BRANCH staging FROM CURRENT AS OF NOW;Branch from Specific Time:
-- Create branch from yesterdayCREATE DATABASE BRANCH test FROM mainAS OF TIMESTAMP '2025-11-20 00:00:00';Branch from Transaction:
-- Create branch from specific transactionCREATE DATABASE BRANCH backup FROM CURRENTAS OF TRANSACTION 987654;With Options:
CREATE DATABASE BRANCH prod_copy FROM main AS OF NOWWITH ( replication_factor = 3, region = 'us-west-2');Working with Branches
Switch Between Branches:
// In application codelet db = EmbeddedDatabase::new_branch("./mydb", "staging")?;List Branches:
-- View all branchesSELECT * FROM pg_database_branches();
-- Output:-- branch_name | parent_branch | created_at | size_bytes-- main | NULL | 2025-11-01 00:00:00 | 157286400-- staging | main | 2025-11-20 10:00:00 | 157286400-- dev | main | 2025-11-20 11:00:00 | 157286400Branch Metadata:
-- Check branch detailsSELECT branch_name, ROUND(size_mb::FLOAT / 1024, 2) as size_gb, created_at, statusFROM pg_database_branches()WHERE status = 'active';Merging Branches
Simple Merge:
-- Merge staging into mainMERGE DATABASE BRANCH staging INTO main;Merge with Conflict Resolution:
MERGE DATABASE BRANCH staging INTO mainWITH ( conflict_resolution = 'branch_wins', -- Source wins conflicts delete_branch_after = true -- Auto-cleanup);Conflict Resolution Options:
branch_wins: Source branch changes override targettarget_wins: Target branch changes override sourcefail: Fail on any conflict (manual resolution required)
Deleting Branches
-- Delete branchDROP DATABASE BRANCH staging;
-- Delete if exists (no error)DROP DATABASE BRANCH IF EXISTS staging;Real-World Workflows
1. Development Workflow:
-- 1. Create dev branchCREATE DATABASE BRANCH dev FROM main AS OF NOW;
-- 2. Developer works in dev branch-- (make changes, test, iterate)
-- 3. Run tests-- (application-level testing)
-- 4. Merge to staging for QACREATE DATABASE BRANCH staging FROM dev AS OF NOW;
-- 5. QA approves, merge to mainMERGE DATABASE BRANCH staging INTO mainWITH (conflict_resolution = 'branch_wins', delete_branch_after = true);
-- 6. Cleanup dev branchDROP DATABASE BRANCH dev;2. Blue-Green Deployment:
-- 1. Production runs on 'blue' branch-- (current production environment)
-- 2. Create 'green' branch for new versionCREATE DATABASE BRANCH green FROM blue AS OF NOW;
-- 3. Deploy new application version to green-- (application connects to green branch)
-- 4. Run tests on green-- (smoke tests, integration tests)
-- 5. Switch traffic to green (at application layer)-- (update connection string to use green)
-- 6. Monitor for issues-- (if issues found, switch back to blue instantly)
-- 7. If successful, make green the new blueMERGE DATABASE BRANCH green INTO blue;DROP DATABASE BRANCH green;CREATE DATABASE BRANCH green FROM blue AS OF NOW; -- Prepare for next deployment3. Testing Schema Changes:
-- 1. Create test branchCREATE DATABASE BRANCH schema_test FROM main AS OF NOW;
-- 2. Apply schema changes in test branchALTER TABLE users ADD COLUMN preferences JSONB;CREATE INDEX idx_user_prefs ON users(preferences);
-- 3. Test queriesSELECT * FROM users WHERE preferences->>'theme' = 'dark';
-- 4. If tests pass, apply to mainMERGE DATABASE BRANCH schema_test INTO main;
-- 5. CleanupDROP DATABASE BRANCH schema_test;4. A/B Testing:
-- 1. Create two branches for A/B testCREATE DATABASE BRANCH variant_a FROM main AS OF NOW;CREATE DATABASE BRANCH variant_b FROM main AS OF NOW;
-- 2. Apply different changes to each-- In variant_a:UPDATE product_recommendations SET algorithm = 'collaborative_filtering';
-- In variant_b:UPDATE product_recommendations SET algorithm = 'content_based';
-- 3. Run A/B test (application routes users to branches)-- (collect metrics, analyze results)
-- 4. Merge winning variantMERGE DATABASE BRANCH variant_a INTO main;
-- 5. CleanupDROP DATABASE BRANCH variant_a;DROP DATABASE BRANCH variant_b;5. Disaster Recovery:
-- Regular backups as branchesCREATE DATABASE BRANCH backup_daily FROM main AS OF NOW;
-- If disaster strikes:-- 1. Create new main from backupCREATE DATABASE BRANCH main_new FROM backup_daily AS OF NOW;
-- 2. Verify data integritySELECT COUNT(*) FROM users;
-- 3. Replace mainDROP DATABASE BRANCH main;ALTER DATABASE BRANCH main_new RENAME TO main;Branch Management Best Practices
1. Naming Convention:
<purpose>-<timestamp><purpose>-<version><team>-<feature>
Examples:- dev-20251121- staging-v2.1- data-team-analytics- backup-daily2. Lifecycle:
Create → Work → Test → Merge → Delete3. Monitor Size:
-- Find large branchesSELECT branch_name, size_mbFROM pg_database_branches()WHERE size_mb > 1000ORDER BY size_mb DESC;4. Automated Cleanup:
-- Drop old test branches (pseudocode for scheduled job)DO $$DECLARE branch_rec RECORD;BEGIN FOR branch_rec IN SELECT branch_name FROM pg_database_branches() WHERE branch_name LIKE 'test-%' AND created_at < NOW() - INTERVAL '7 days' LOOP EXECUTE 'DROP DATABASE BRANCH ' || branch_rec.branch_name; END LOOP;END $$;Materialized Views
Materialized Views (MVs) are precomputed query results stored as tables. They speed up complex queries at the cost of some staleness.
Why Use Materialized Views?
Problem: Complex aggregate queries are slow
-- This query is slow on millions of rowsSELECT customer_id, COUNT(*) as order_count, SUM(total) as revenue, AVG(total) as avg_order_valueFROM ordersGROUP BY customer_id;Solution: Precompute and cache the results
CREATE MATERIALIZED VIEW customer_stats ASSELECT customer_id, COUNT(*) as order_count, SUM(total) as revenue, AVG(total) as avg_order_valueFROM ordersGROUP BY customer_id;
-- Now this is instantSELECT * FROM customer_stats WHERE customer_id = 123;Creating Materialized Views
Basic MV:
CREATE MATERIALIZED VIEW daily_sales ASSELECT DATE(created_at) as date, COUNT(*) as orders, SUM(total) as revenueFROM ordersGROUP BY DATE(created_at);MV with Indexes:
-- Create MVCREATE MATERIALIZED VIEW user_stats ASSELECT user_id, COUNT(*) as order_count, SUM(total) as revenueFROM ordersGROUP BY user_id;
-- Add index for fast lookupsCREATE INDEX idx_user_stats_revenue ON user_stats(revenue DESC);Auto-Refresh MV:
CREATE MATERIALIZED VIEW product_popularity ASSELECT product_id, COUNT(*) as view_count, COUNT(DISTINCT user_id) as unique_viewersFROM product_viewsGROUP BY product_idWITH ( auto_refresh = true, -- Enable automatic refresh max_cpu_percent = 15, -- Limit CPU usage to 15% threshold_dml_rate = 100, -- Refresh when 100+ DMLs/sec lazy_update = true, -- Defer during high load lazy_catchup_window = '1 hour' -- Max staleness allowed);Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
auto_refresh | boolean | false | Enable background auto-refresh |
max_cpu_percent | float | 15.0 | Maximum CPU usage (0-100) |
threshold_table_size | string | ’0’ | Min base table size to trigger refresh |
threshold_dml_rate | int | 0 | DMLs/sec to trigger refresh |
lazy_update | boolean | false | Defer refresh during high load |
lazy_catchup_window | string | ’1h’ | Maximum staleness allowed |
Refreshing Materialized Views
Manual Refresh:
-- Blocking refresh (locks the MV)REFRESH MATERIALIZED VIEW user_stats;
-- Concurrent refresh (non-blocking, users can still query)REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;Automatic Refresh:
-- HeliosDB-Lite automatically refreshes based on:-- 1. DML rate (inserts/updates/deletes per second)-- 2. CPU availability (respects max_cpu_percent)-- 3. Staleness (lazy_catchup_window)
-- Check auto-refresh statusSELECT * FROM pg_mv_staleness();Monitoring Materialized Views
Check Staleness:
SELECT view_name, staleness_sec / 60 as staleness_minutes, pending_changes, statusFROM pg_mv_staleness()ORDER BY staleness_sec DESC;Output:
view_name | staleness_minutes | pending_changes | status-----------------+-------------------+-----------------+-----------user_stats | 12.5 | 1250 | staleproduct_stats | 0.5 | 50 | freshdaily_sales | 0.0 | 0 | freshCheck CPU Usage:
SELECT view_name, max_cpu_percent, current_cpu_percent, statusFROM pg_mv_cpu_usage();Dropping Materialized Views
-- Drop MVDROP MATERIALIZED VIEW user_stats;
-- Drop if existsDROP MATERIALIZED VIEW IF EXISTS user_stats;Real-World Use Cases
1. Analytics Dashboard:
-- Instead of slow live queriesCREATE MATERIALIZED VIEW dashboard_metrics ASSELECT DATE(created_at) as date, COUNT(*) as total_orders, SUM(total) as revenue, COUNT(DISTINCT customer_id) as unique_customers, AVG(total) as avg_order_valueFROM ordersGROUP BY DATE(created_at)WITH ( auto_refresh = true, max_cpu_percent = 10, threshold_dml_rate = 50);
-- Dashboard queries are now instantSELECT * FROM dashboard_metricsWHERE date >= CURRENT_DATE - INTERVAL '30 days'ORDER BY date DESC;2. Leaderboard:
CREATE MATERIALIZED VIEW user_leaderboard ASSELECT user_id, users.username, COUNT(*) as games_played, SUM(score) as total_score, AVG(score) as avg_score, MAX(score) as high_score, RANK() OVER (ORDER BY SUM(score) DESC) as rankFROM game_resultsJOIN users ON game_results.user_id = users.idGROUP BY user_id, users.usernameWITH ( auto_refresh = true, max_cpu_percent = 20, lazy_catchup_window = '5 minutes');
-- Leaderboard API is fastSELECT * FROM user_leaderboard ORDER BY rank LIMIT 100;3. E-commerce Product Stats:
CREATE MATERIALIZED VIEW product_stats ASSELECT products.product_id, products.name, COUNT(DISTINCT orders.customer_id) as buyers, COUNT(orders.order_id) as times_purchased, SUM(orders.quantity) as units_sold, SUM(orders.total) as revenue, AVG(reviews.rating) as avg_rating, COUNT(reviews.review_id) as review_countFROM productsLEFT JOIN orders ON products.product_id = orders.product_idLEFT JOIN reviews ON products.product_id = reviews.product_idGROUP BY products.product_id, products.nameWITH (auto_refresh = true, max_cpu_percent = 15);
-- Product page loads instantlySELECT * FROM product_stats WHERE product_id = 123;4. Real-Time Inventory:
CREATE MATERIALIZED VIEW inventory_status ASSELECT warehouse_id, product_id, SUM(quantity) as on_hand, SUM(CASE WHEN status = 'reserved' THEN quantity ELSE 0 END) as reserved, SUM(quantity) - SUM(CASE WHEN status = 'reserved' THEN quantity ELSE 0 END) as availableFROM inventoryGROUP BY warehouse_id, product_idWITH ( auto_refresh = true, max_cpu_percent = 25, threshold_dml_rate = 200, -- High rate for inventory updates lazy_catchup_window = '30 seconds' -- Near real-time);Performance Tuning
1. Start Conservative:
-- Begin with safe settingsWITH ( max_cpu_percent = 10, lazy_update = true, lazy_catchup_window = '1 hour')
-- Gradually increase based on monitoring2. Index the MV:
-- Create indexes for common queriesCREATE INDEX idx_user_stats_revenue ON user_stats(revenue DESC);CREATE INDEX idx_daily_sales_date ON daily_sales(date DESC);3. Partition Large MVs:
-- Instead of one huge MVCREATE MATERIALIZED VIEW sales_2025 ASSELECT * FROM orders WHERE YEAR(created_at) = 2025;
CREATE MATERIALIZED VIEW sales_2024 ASSELECT * FROM orders WHERE YEAR(created_at) = 2024;4. Choose Right Refresh Strategy:
-- Low-traffic: Aggressive refreshWITH (max_cpu_percent = 30, threshold_dml_rate = 10)
-- High-traffic: Conservative refreshWITH (max_cpu_percent = 10, lazy_update = true)
-- Critical: Manual refresh during maintenance windowWITH (auto_refresh = false)-- Then: REFRESH MATERIALIZED VIEW CONCURRENTLY during off-peakNext Steps
Congratulations! You’ve learned the fundamentals of HeliosDB-Lite. Here’s what to explore next:
Advanced Features
1. Multi-Tenancy:
-- Row-Level Security for SaaS applicationsCREATE TABLE documents ( id BIGINT PRIMARY KEY, tenant_id INT NOT NULL, content TEXT);
CREATE POLICY tenant_isolation ON documents USING (tenant_id = current_tenant_id());2. Full-Text Search:
-- Coming in v2.1CREATE INDEX idx_fts ON documents USING gin(to_tsvector('english', content));SELECT * FROM documents WHERE to_tsvector(content) @@ to_tsquery('database & vector');3. JSONB Operations:
-- Store and query JSON efficientlyCREATE TABLE users ( id INT PRIMARY KEY, metadata JSONB);
INSERT INTO users VALUES (1, '{"name": "Alice", "age": 30, "tags": ["vip", "premium"]}');
SELECT * FROM users WHERE metadata->>'age' > '28';SELECT * FROM users WHERE metadata->'tags' ? 'vip';4. Encryption:
use heliosdb_lite::{EmbeddedDatabase, Config};use heliosdb_lite::crypto::EncryptionKey;
// Generate encryption keylet key = EncryptionKey::generate()?;
// Create encrypted databaselet config = Config { encryption_key: Some(key), ..Default::default()};let db = EmbeddedDatabase::with_config("./secure_db", config)?;
// All data is transparently encryptedLearning Resources
Documentation:
- SQL Reference - Complete SQL syntax
- Vector Search Guide - Deep dive into embeddings
- Phase 3 User Guide - Advanced v2.0 features
- System Catalog - System views reference
- API Documentation - Rust API reference
Examples:
# Browse example codels examples/# - quickstart.rs (basics)# - vector_search_demo.rs (semantic search)# - encryption_demo.rs (TDE)# - concurrent_mv_refresh_demo.rs (materialized views)Configuration:
Integration Guides
ORM Support:
# Read ORM integration guidecat docs/ORM_SUPPORT.mdServer Mode:
# Start PostgreSQL-compatible serverheliosdb-lite server --port 5432 --data ./mydb
# Connect with any PostgreSQL clientpsql postgresql://localhost:5432/mydbMigration:
# Migrate from PostgreSQL/SQLitecat docs/MIGRATION.mdCommunity and Support
Get Help:
- GitHub Issues: https://github.com/dimensigon/HeliosDB-Lite/issues
- Documentation: https://docs.heliosdb.com/lite
- Community Forum: [Coming Soon]
Contributing:
- Read CONTRIBUTING.md
- Check open issues
- Submit pull requests
Stay Updated:
- Watch the repository
- Follow release notes
- Join mailing list [Coming Soon]
Project Ideas
1. Build a RAG System:
- Index documentation with embeddings
- Semantic search for relevant context
- Feed to LLM for accurate answers
2. E-commerce Backend:
- Multi-tenant product catalog
- Vector search for product recommendations
- Materialized views for analytics
3. Git-like Version Control:
- Use database branching for workflows
- Time-travel for audit trails
- Blue-green deployments
4. Real-Time Dashboard:
- Materialized views for aggregates
- Auto-refresh for near real-time data
- Vector clustering for insights
5. Mobile App Backend:
- Embedded mode for offline-first
- Server mode for sync
- Encryption for security
Quick Reference Card
Common Commands:
-- TablesCREATE TABLE name (col TYPE);DROP TABLE name;\d name -- Describe table (REPL)
-- CRUDINSERT INTO table VALUES (...);SELECT * FROM table WHERE condition;UPDATE table SET col = val WHERE condition;DELETE FROM table WHERE condition;
-- IndexesCREATE INDEX idx_name ON table(col);CREATE INDEX idx_vec ON table USING hnsw (vec vector_cosine_ops);
-- TransactionsBEGIN; ... COMMIT;BEGIN; ... ROLLBACK;
-- Vector SearchSELECT * FROM docs ORDER BY embedding <=> '[...]' LIMIT 10;
-- Time-TravelSELECT * FROM table AS OF TIMESTAMP '2025-11-20 00:00:00';
-- BranchingCREATE DATABASE BRANCH dev FROM main AS OF NOW;MERGE DATABASE BRANCH dev INTO main;DROP DATABASE BRANCH dev;
-- Materialized ViewsCREATE MATERIALIZED VIEW mv AS SELECT ...;REFRESH MATERIALIZED VIEW mv;DROP MATERIALIZED VIEW mv;
-- System ViewsSELECT * FROM pg_database_branches();SELECT * FROM pg_mv_staleness();SELECT * FROM pg_vector_index_stats();REPL Meta-Commands:
\h Help\d List tables\d table Describe table\dS List system views\dS view Describe system view\q QuitRust API:
// Create databaselet db = EmbeddedDatabase::new("./mydb")?;let db = EmbeddedDatabase::new_in_memory()?;
// Execute SQLdb.execute("CREATE TABLE ...")?;let count = db.execute("INSERT ...")?;
// Querylet results = db.query("SELECT * FROM users", &[])?;
// Transactionlet tx = db.begin_transaction()?;tx.execute("INSERT ...")?;tx.commit()?;Troubleshooting
Problem: “Table already exists” error
-- Solution: Use IF NOT EXISTSCREATE TABLE IF NOT EXISTS users (...);Problem: Vector dimensions mismatch
-- Solution: Ensure all vectors have same dimensionCREATE TABLE docs (embedding VECTOR(384)); -- All must be 384-dimProblem: Slow vector search
-- Solution: Create HNSW indexCREATE INDEX idx ON docs USING hnsw (embedding vector_cosine_ops);Problem: Materialized view is stale
-- Solution: Manual refreshREFRESH MATERIALIZED VIEW mv;
-- Or check auto-refresh settingsSELECT * FROM pg_mv_staleness();Problem: Out of memory with large vectors
-- Solution: Use Product QuantizationCREATE INDEX idx ON docs USING hnsw (embedding vector_cosine_ops)WITH (quantization = 'product', pq_subquantizers = 8);Glossary
- ACID: Atomicity, Consistency, Isolation, Durability (transaction guarantees)
- Embedding: Vector representation of data (text, image, etc.)
- HNSW: Hierarchical Navigable Small World (vector index algorithm)
- MVCC: Multi-Version Concurrency Control (enables time-travel)
- MV: Materialized View (cached query result)
- PQ: Product Quantization (vector compression technique)
- RAG: Retrieval-Augmented Generation (LLM + knowledge base)
- Semantic Search: Search by meaning, not keywords
- TDE: Transparent Data Encryption
Document Version: 1.0 Last Updated: November 21, 2025 Applies To: HeliosDB-Lite v2.0.0 and later
For questions or feedback, please open an issue on GitHub.
Happy coding with HeliosDB-Lite! 🚀