HeliosDB Nano SQLite Layer Architecture - User Guide
HeliosDB Nano SQLite Layer Architecture - User Guide
A user-friendly explanation of how HeliosDB Nano works under the hood, designed for developers who want to understand the system without diving into the source code.
Overview: What Makes HeliosDB Nano Different?
HeliosDB Nano is built from the ground up in Rust for performance, safety, and advanced features - but it’s designed to look and feel exactly like SQLite from a Python perspective.
The magic: HeliosDB Nano implements the Python DB-API 2.0 interface (same as SQLite) while using a completely different engine underneath.
┌─────────────────────────────────────────────────────────────┐│ Your Python Application ││ import heliosdb_sqlite as sqlite3 │└──────────────────────────────┬──────────────────────────────┘ │ │ Python DB-API 2.0 │ (Same interface as SQLite) │┌──────────────────────────────▼──────────────────────────────┐│ HeliosDB Nano SQLite Compatibility Layer ││ (Python bindings) │└──────────────────────────────┬──────────────────────────────┘ │ │ PyO3 (Python ↔ Rust bridge) │┌──────────────────────────────▼──────────────────────────────┐│ HeliosDB Nano Core Engine ││ (Written in Rust) ││ ││ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││ │ SQL Parser & │ │ Query Exec │ │ Storage │ ││ │ Planner │ │ Engine │ │ Engine │ ││ └──────────────┘ └──────────────┘ └──────────────┘ ││ ││ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ││ │ Concurrency │ │ Vector Search│ │ Time-Travel │ ││ │ Control │ │ (HNSW) │ │ (MVCC) │ ││ └──────────────┘ └──────────────┘ └──────────────┘ │└──────────────────────────────────────────────────────────┘How HeliosDB Nano SQLite Layer Works
When You Import
import heliosdb_sqlite as sqlite3What happens:
- Python loads the HeliosDB Nano module (compiled Rust library with Python bindings)
- The module registers itself as a SQLite-compatible database driver
- All standard DB-API 2.0 methods (connect, execute, etc.) become available
- No configuration required - it just works
Behind the scenes:
- The Python module is a thin wrapper around the Rust core
- PyO3 (Python-Rust bridge) handles type conversions automatically
- Function calls are fast (near-native performance)
When You Connect to a Database
conn = sqlite3.connect('myapp.db')What happens:
-
File detection:
- HeliosDB Nano checks if the file exists
- If it’s an existing SQLite database, it can read it directly
- If it’s a new file, it creates a HeliosDB Nano database
-
Engine initialization:
- Opens or creates the database file
- Initializes the storage engine
- Sets up transaction management
- Prepares connection handle
-
Connection object returned:
- You get a Python connection object
- It implements all standard SQLite connection methods
- Plus optional HeliosDB Nano advanced features
File format:
- HeliosDB Nano uses a compatible format for basic data
- SQLite can read HeliosDB Nano databases (for basic tables)
- HeliosDB Nano can read SQLite databases
- Advanced features (vectors, branches) stored in extended sections
When You Execute a Query
cursor = conn.cursor()cursor.execute('SELECT * FROM users WHERE age > ?', (18,))results = cursor.fetchall()What happens:
1. Python → Rust Bridge ┌─────────────────────────────────────────────────────┐ │ cursor.execute(sql, params) │ │ ↓ │ │ PyO3 converts Python types to Rust types │ │ ↓ │ │ Rust receives: &str (SQL) + Vec<Value> (params) │ └─────────────────────────────────────────────────────┘
2. SQL Parsing ┌─────────────────────────────────────────────────────┐ │ Parse SQL into Abstract Syntax Tree (AST) │ │ Validate syntax │ │ Extract table references, columns, conditions │ └─────────────────────────────────────────────────────┘
3. Query Planning ┌─────────────────────────────────────────────────────┐ │ Analyze query and data statistics │ │ Choose optimal execution strategy │ │ Determine index usage │ │ Optimize joins and filters │ └─────────────────────────────────────────────────────┘
4. Execution ┌─────────────────────────────────────────────────────┐ │ Access storage engine for table data │ │ Apply filters (WHERE age > 18) │ │ Use indexes if available │ │ Collect matching rows │ └─────────────────────────────────────────────────────┘
5. Result Conversion ┌─────────────────────────────────────────────────────┐ │ Convert Rust data structures to Python types │ │ List of tuples/rows returned to Python │ │ Cursor stores results for fetchall()/fetchone() │ └─────────────────────────────────────────────────────┘Performance optimization:
- Prepared statement caching
- Query plan reuse
- Zero-copy operations where possible
- Parallel execution for complex queries
The Three Modes Explained
HeliosDB Nano can run in three different modes, giving you flexibility in how you deploy and use it.
Mode 1: Embedded Mode (Default)
How it works:
conn = sqlite3.connect('app.db') # Default mode┌──────────────────────────────────────────────┐│ Your Python Application ││ ││ ┌────────────────────────────────────┐ ││ │ HeliosDB Nano Engine │ ││ │ (runs in same process) │ ││ └────────────────────────────────────┘ ││ ↓ ││ Database File ││ (app.db) │└──────────────────────────────────────────────┘Characteristics:
- Database engine runs inside your application process
- Direct memory access (fastest)
- No network overhead
- Similar to how SQLite works
- Perfect for single-application deployments
Use when:
- Single application accessing database
- Maximum performance needed
- Embedded/desktop applications
- Simple deployments
Mode 2: Server Mode (PostgreSQL-Compatible)
How it works:
server = sqlite3.start_server('app.db', port=5432)┌─────────────────┐ ┌─────────────────┐│ Application 1 │ │ Application 2 ││ (Python) │ │ (Node.js) │└────────┬────────┘ └────────┬────────┘ │ │ │ PostgreSQL Protocol │ │ │ └────────────┬──────────────┘ │ ┌────────────▼─────────────┐ │ HeliosDB Nano Server │ │ (Network daemon) │ │ │ │ ┌──────────────┐ │ │ │ Query Router │ │ │ └──────┬───────┘ │ │ │ │ │ ┌──────▼────────┐ │ │ │ Core Engine │ │ │ └──────┬────────┘ │ └───────────┼──────────────┘ │ ┌──────▼──────┐ │ Database │ │ (app.db) │ └─────────────┘Characteristics:
- Database runs as separate server process
- Multiple clients can connect
- PostgreSQL wire protocol (use psycopg2, pg, etc.)
- Network overhead, but enables multi-client access
- Horizontal scaling possible
Use when:
- Multiple applications need database access
- Microservices architecture
- Remote database access needed
- Team development (shared database)
Mode 3: Hybrid Mode (Best of Both Worlds)
How it works:
conn = sqlite3.connect('app.db', heliosdb_mode='hybrid')┌─────────────────────────────────────────────┐│ Your Python Application ││ ││ ┌──────────────────────────────────────┐ ││ │ Direct Embedded Access (Fast) │ ││ │ ↓ │ ││ │ HeliosDB Nano Core Engine │ ││ └──────────────┬───────────────────────┘ ││ │ │└─────────────────┼───────────────────────────┘ │ ┌────────▼─────────┐ │ │ │ Database File │ │ │ └────────▲─────────┘ │┌─────────────────┼───────────────────────────┐│ Server Thread │ ││ ↓ │ ││ PostgreSQL Protocol Server ││ (Accepts remote connections) │└────────────────────────────────────────────┘ ↑ │ PostgreSQL Protocol │┌────────┴─────────┐│ Remote Client ││ (Dashboard, ││ Monitoring, ││ Admin tools) │└──────────────────┘Characteristics:
- Your application gets direct embedded access (fast!)
- Server thread runs in background for remote access
- Same database, different access methods
- No performance penalty for local access
- Enables remote monitoring/management
Use when:
- Main application needs fast local access
- Also need remote admin/monitoring
- Want flexibility without choosing
- Development/production parity
How Advanced Features Work
Vector Search
Architecture:
cursor.execute(''' SELECT id, title, vector_distance(embedding, ?, 'cosine') as similarity FROM articles ORDER BY similarity LIMIT 10''', (query_embedding,))What happens:
- Vector column stored as special data type (VECTOR(n))
- HNSW (Hierarchical Navigable Small World) index created automatically
- Query planner recognizes vector_distance function
- Uses specialized vector search algorithm (not full table scan)
- Results ranked by similarity
Storage:
Regular Table Storage:┌──────┬────────┬──────────────────────────────────┐│ id │ title │ content │├──────┼────────┼──────────────────────────────────┤│ 1 │ "AI" │ "Machine learning introduction" │└──────┴────────┴──────────────────────────────────┘
Vector Extension Storage:┌──────┬──────────────────────────────────────────┐│ id │ embedding (384-dim vector) │├──────┼──────────────────────────────────────────┤│ 1 │ [0.23, 0.45, 0.12, ..., 0.67] │└──────┴──────────────────────────────────────────┘ ↓ HNSW Index (graph structure for fast search)Performance:
- Approximate nearest neighbor search: O(log n)
- Much faster than brute force: O(n)
- Trade-off: slight accuracy for massive speed
Time-Travel Queries
Architecture:
cursor.execute(''' SELECT * FROM users AS OF TIMESTAMP '2024-01-01 10:00:00'''')How it works:
MVCC (Multi-Version Concurrency Control)
Current Data View:┌──────┬─────────┬──────────────────────┐│ id │ name │ version/timestamp │├──────┼─────────┼──────────────────────┤│ 1 │ "Alice" │ 2024-06-01 15:00 │└──────┴─────────┴──────────────────────┘
Historical Versions (stored separately):┌──────┬─────────┬──────────────────────┐│ id │ name │ version/timestamp │├──────┼─────────┼──────────────────────┤│ 1 │ "A" │ 2024-01-01 09:00 ││ 1 │ "Al" │ 2024-01-01 10:30 ││ 1 │ "Alice" │ 2024-06-01 15:00 │ ← Current└──────┴─────────┴──────────────────────┘
Query with AS OF TIMESTAMP:- Looks up version <= specified timestamp- Returns historical state- No impact on current dataStorage overhead:
- Only changed values stored (delta compression)
- Configurable retention period
- Automatic cleanup of old versions
Database Branching
Architecture:
branch = conn.create_branch('experiment')branch_conn = sqlite3.connect('app.db', branch='experiment')How it works:
Main Database:┌────────────────────────────────────┐│ Main branch (active) ││ ││ ┌──────────────────────────────┐ ││ │ Table: users │ ││ │ 1000 rows │ ││ └──────────────────────────────┘ │└────────────────────────────────────┘ │ │ create_branch('experiment') │ ▼┌────────────────────────────────────┐│ Experiment branch (copy-on-write) ││ ││ ┌──────────────────────────────┐ ││ │ Only stores CHANGES │ ││ │ from main branch │ ││ │ │ ││ │ Modified rows: 50 │ ││ │ Deleted rows: 10 │ ││ │ New rows: 5 │ ││ └──────────────────────────────┘ │└────────────────────────────────────┘
Reading from branch:1. Check if row modified in branch → use branch version2. If not modified → use main branch version3. Apply deletes from branch4. Add new rows from branch
Result: Full database view without duplicating unchanged dataBenefits:
- Instant branch creation (no copying)
- Minimal storage overhead
- Safe experimentation
- Easy merge or discard
Data Storage Explanation
File Structure
myapp.db (HeliosDB Nano database file)│├─ Header (metadata)│ ├─ Format version│ ├─ Page size│ ├─ Schema version│ └─ Feature flags│├─ Schema section│ ├─ Table definitions│ ├─ Index definitions│ ├─ View definitions│ └─ Constraints│├─ Data pages (table data)│ ├─ B-tree structure (like SQLite)│ ├─ Row storage│ └─ Index structures│├─ Advanced features section (optional)│ ├─ Vector indexes (HNSW graphs)│ ├─ Time-travel versions (MVCC)│ ├─ Branch metadata│ └─ Encryption metadata│└─ Transaction log (WAL) ├─ Uncommitted changes └─ Recovery informationCompatibility:
- Basic sections compatible with SQLite
- Advanced sections ignored by SQLite (safe)
- HeliosDB Nano reads both formats seamlessly
Performance Characteristics
Memory Usage
Per Connection:- Base overhead: ~100 KB- Query cache: ~10 MB (configurable)- Connection buffers: ~1 MB
Per Table:- Schema metadata: ~1-10 KB- Index overhead: Varies (5-20% of table size)
Vector Search:- HNSW index: ~20-40% of vector data size- In-memory: Optional, speeds up queries 10-100x
Time-Travel:- Historical versions: Configurable retention- Delta compression reduces overhead to ~5-15%Query Performance
Operation | SQLite | HeliosDB Nano----------------------|-------------|---------------Single row read | 10 μs | 12 μs (comparable)Indexed search | 100 μs | 105 μs (comparable)Full table scan | 10 ms | 11 ms (comparable)Single write | 50 μs | 75 μs (slight overhead)Concurrent reads | Excellent | Excellent (same)Concurrent writes | Blocked | Parallel! (major win)Vector search (10k) | N/A | 2 ms (exclusive)Time-travel query | N/A | 15 ms (exclusive)Key takeaway:
- Similar performance for standard operations
- Much better with concurrent writes
- Advanced features add new capabilities (not slower alternatives)
Concurrency Model
How HeliosDB Nano Handles Concurrent Access
SQLite approach (locks):
Thread 1: BEGIN → Write → [Lock database] → COMMIT → [Unlock]Thread 2: [WAIT...] → BEGIN → WriteThread 3: [WAIT...] → BEGINHeliosDB Nano approach (MVCC):
Thread 1: BEGIN → Write to version 1 → COMMITThread 2: BEGIN → Write to version 2 → COMMIT (simultaneous!)Thread 3: BEGIN → Read version 1 → done (doesn't block writes)
Conflict resolution:- Non-conflicting writes: Both succeed- Conflicting writes: Last commit wins (configurable)- Readers never block writers- Writers never block readersImplementation:
Transaction Log:
Time ─────────────────────────────▶
T1: [BEGIN] ────── [Write row A] ────── [COMMIT v1]T2: [BEGIN] ────── [Write row B] ────── [COMMIT v2]T3: [BEGIN] [Read] [COMMIT]
All transactions see consistent snapshotNo locks needed!What Happens When You Import
Detailed Import Process
import heliosdb_sqlite as sqlite3Step-by-step:
-
Python loads .so/.dll (compiled Rust library)
HeliosDB Nano module ────▶ [OS loads shared library] -
Module initialization (happens once)
// Rust code (simplified)#[pymodule]fn heliosdb_sqlite(py: Python, m: &PyModule) -> PyResult<()> {m.add_class::<Connection>()?;m.add_class::<Cursor>()?;m.add_function(wrap_pyfunction!(connect, m)?)?;// ... register all DB-API 2.0 functionsOk(())} -
Python receives module with all methods
# Now available:sqlite3.connect(...)sqlite3.Connectionsqlite3.Cursorsqlite3.Row# etc. - full DB-API 2.0 interface -
No configuration needed - ready to use!
Threading and Async Support
Thread Safety
HeliosDB Nano is thread-safe by default (unlike SQLite):
# This works safely in HeliosDB Nanoconn = sqlite3.connect('app.db', check_same_thread=False)
def worker(thread_id): cursor = conn.cursor() # Safe from multiple threads cursor.execute('INSERT INTO logs VALUES (?, ?)', (thread_id, 'message')) conn.commit()
threads = [Thread(target=worker, args=(i,)) for i in range(100)]# All threads can use same connection safely!How it’s safe:
- Rust’s ownership model prevents data races
- Internal synchronization (lock-free where possible)
- MVCC eliminates most contention
Async/Await Support
import asyncioimport heliosdb_sqlite as sqlite3
async def async_query(): loop = asyncio.get_event_loop()
# Run blocking operation in thread pool def blocking_query(): conn = sqlite3.connect('app.db') cursor = conn.cursor() cursor.execute('SELECT * FROM users') return cursor.fetchall()
results = await loop.run_in_executor(None, blocking_query) return results
# Or use async connection poolfrom heliosdb_sqlite.asyncio import AsyncConnectionPool
pool = AsyncConnectionPool('app.db')async with pool.connection() as conn: cursor = conn.cursor() results = await cursor.execute('SELECT * FROM users')Summary: Why This Architecture Matters
For Drop-In Compatibility:
- Implements exact same Python interface as SQLite
- You change one import line, everything works
- Existing code, queries, databases all compatible
For Performance:
- Rust core = fast and safe
- MVCC = no write locks
- Parallel execution where possible
- Optimized for modern workloads
For Advanced Features:
- Vector search: Built-in HNSW indexes
- Time-travel: MVCC enables historical queries
- Branching: Copy-on-write for safe experimentation
- Encryption: Transparent at storage layer
For Flexibility:
- Three modes: embedded, server, hybrid
- Start simple, scale when needed
- No architectural changes required
What’s Next?
Now that you understand how HeliosDB Nano works:
- Try it out:
HELIOSDB_SQLITE_DROP_IN_GUIDE.md - Migrate your app:
HELIOSDB_SQLITE_MIGRATION_PATTERNS.md - Explore advanced features:
HELIOSDB_SQLITE_ADVANCED_FEATURES.md - Troubleshoot issues:
HELIOSDB_SQLITE_TROUBLESHOOTING.md - Read FAQ:
HELIOSDB_SQLITE_FAQ.md
The architecture is designed to be invisible when you don’t need it, and powerful when you do.
Happy building with HeliosDB Nano!