Skip to content

HeliosDB Nano SQLite Layer Architecture - User Guide

HeliosDB Nano SQLite Layer Architecture - User Guide

A user-friendly explanation of how HeliosDB Nano works under the hood, designed for developers who want to understand the system without diving into the source code.


Overview: What Makes HeliosDB Nano Different?

HeliosDB Nano is built from the ground up in Rust for performance, safety, and advanced features - but it’s designed to look and feel exactly like SQLite from a Python perspective.

The magic: HeliosDB Nano implements the Python DB-API 2.0 interface (same as SQLite) while using a completely different engine underneath.

┌─────────────────────────────────────────────────────────────┐
│ Your Python Application │
│ import heliosdb_sqlite as sqlite3 │
└──────────────────────────────┬──────────────────────────────┘
│ Python DB-API 2.0
│ (Same interface as SQLite)
┌──────────────────────────────▼──────────────────────────────┐
│ HeliosDB Nano SQLite Compatibility Layer │
│ (Python bindings) │
└──────────────────────────────┬──────────────────────────────┘
│ PyO3 (Python ↔ Rust bridge)
┌──────────────────────────────▼──────────────────────────────┐
│ HeliosDB Nano Core Engine │
│ (Written in Rust) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ SQL Parser & │ │ Query Exec │ │ Storage │ │
│ │ Planner │ │ Engine │ │ Engine │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Concurrency │ │ Vector Search│ │ Time-Travel │ │
│ │ Control │ │ (HNSW) │ │ (MVCC) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└──────────────────────────────────────────────────────────┘

How HeliosDB Nano SQLite Layer Works

When You Import

import heliosdb_sqlite as sqlite3

What happens:

  1. Python loads the HeliosDB Nano module (compiled Rust library with Python bindings)
  2. The module registers itself as a SQLite-compatible database driver
  3. All standard DB-API 2.0 methods (connect, execute, etc.) become available
  4. No configuration required - it just works

Behind the scenes:

  • The Python module is a thin wrapper around the Rust core
  • PyO3 (Python-Rust bridge) handles type conversions automatically
  • Function calls are fast (near-native performance)

When You Connect to a Database

conn = sqlite3.connect('myapp.db')

What happens:

  1. File detection:

    • HeliosDB Nano checks if the file exists
    • If it’s an existing SQLite database, it can read it directly
    • If it’s a new file, it creates a HeliosDB Nano database
  2. Engine initialization:

    • Opens or creates the database file
    • Initializes the storage engine
    • Sets up transaction management
    • Prepares connection handle
  3. Connection object returned:

    • You get a Python connection object
    • It implements all standard SQLite connection methods
    • Plus optional HeliosDB Nano advanced features

File format:

  • HeliosDB Nano uses a compatible format for basic data
  • SQLite can read HeliosDB Nano databases (for basic tables)
  • HeliosDB Nano can read SQLite databases
  • Advanced features (vectors, branches) stored in extended sections

When You Execute a Query

cursor = conn.cursor()
cursor.execute('SELECT * FROM users WHERE age > ?', (18,))
results = cursor.fetchall()

What happens:

1. Python → Rust Bridge
┌─────────────────────────────────────────────────────┐
│ cursor.execute(sql, params) │
│ ↓ │
│ PyO3 converts Python types to Rust types │
│ ↓ │
│ Rust receives: &str (SQL) + Vec<Value> (params) │
└─────────────────────────────────────────────────────┘
2. SQL Parsing
┌─────────────────────────────────────────────────────┐
│ Parse SQL into Abstract Syntax Tree (AST) │
│ Validate syntax │
│ Extract table references, columns, conditions │
└─────────────────────────────────────────────────────┘
3. Query Planning
┌─────────────────────────────────────────────────────┐
│ Analyze query and data statistics │
│ Choose optimal execution strategy │
│ Determine index usage │
│ Optimize joins and filters │
└─────────────────────────────────────────────────────┘
4. Execution
┌─────────────────────────────────────────────────────┐
│ Access storage engine for table data │
│ Apply filters (WHERE age > 18) │
│ Use indexes if available │
│ Collect matching rows │
└─────────────────────────────────────────────────────┘
5. Result Conversion
┌─────────────────────────────────────────────────────┐
│ Convert Rust data structures to Python types │
│ List of tuples/rows returned to Python │
│ Cursor stores results for fetchall()/fetchone() │
└─────────────────────────────────────────────────────┘

Performance optimization:

  • Prepared statement caching
  • Query plan reuse
  • Zero-copy operations where possible
  • Parallel execution for complex queries

The Three Modes Explained

HeliosDB Nano can run in three different modes, giving you flexibility in how you deploy and use it.

Mode 1: Embedded Mode (Default)

How it works:

conn = sqlite3.connect('app.db') # Default mode
┌──────────────────────────────────────────────┐
│ Your Python Application │
│ │
│ ┌────────────────────────────────────┐ │
│ │ HeliosDB Nano Engine │ │
│ │ (runs in same process) │ │
│ └────────────────────────────────────┘ │
│ ↓ │
│ Database File │
│ (app.db) │
└──────────────────────────────────────────────┘

Characteristics:

  • Database engine runs inside your application process
  • Direct memory access (fastest)
  • No network overhead
  • Similar to how SQLite works
  • Perfect for single-application deployments

Use when:

  • Single application accessing database
  • Maximum performance needed
  • Embedded/desktop applications
  • Simple deployments

Mode 2: Server Mode (PostgreSQL-Compatible)

How it works:

server = sqlite3.start_server('app.db', port=5432)
┌─────────────────┐ ┌─────────────────┐
│ Application 1 │ │ Application 2 │
│ (Python) │ │ (Node.js) │
└────────┬────────┘ └────────┬────────┘
│ │
│ PostgreSQL Protocol │
│ │
└────────────┬──────────────┘
┌────────────▼─────────────┐
│ HeliosDB Nano Server │
│ (Network daemon) │
│ │
│ ┌──────────────┐ │
│ │ Query Router │ │
│ └──────┬───────┘ │
│ │ │
│ ┌──────▼────────┐ │
│ │ Core Engine │ │
│ └──────┬────────┘ │
└───────────┼──────────────┘
┌──────▼──────┐
│ Database │
│ (app.db) │
└─────────────┘

Characteristics:

  • Database runs as separate server process
  • Multiple clients can connect
  • PostgreSQL wire protocol (use psycopg2, pg, etc.)
  • Network overhead, but enables multi-client access
  • Horizontal scaling possible

Use when:

  • Multiple applications need database access
  • Microservices architecture
  • Remote database access needed
  • Team development (shared database)

Mode 3: Hybrid Mode (Best of Both Worlds)

How it works:

conn = sqlite3.connect('app.db', heliosdb_mode='hybrid')
┌─────────────────────────────────────────────┐
│ Your Python Application │
│ │
│ ┌──────────────────────────────────────┐ │
│ │ Direct Embedded Access (Fast) │ │
│ │ ↓ │ │
│ │ HeliosDB Nano Core Engine │ │
│ └──────────────┬───────────────────────┘ │
│ │ │
└─────────────────┼───────────────────────────┘
┌────────▼─────────┐
│ │
│ Database File │
│ │
└────────▲─────────┘
┌─────────────────┼───────────────────────────┐
│ Server Thread │ │
│ ↓ │ │
│ PostgreSQL Protocol Server │
│ (Accepts remote connections) │
└────────────────────────────────────────────┘
│ PostgreSQL Protocol
┌────────┴─────────┐
│ Remote Client │
│ (Dashboard, │
│ Monitoring, │
│ Admin tools) │
└──────────────────┘

Characteristics:

  • Your application gets direct embedded access (fast!)
  • Server thread runs in background for remote access
  • Same database, different access methods
  • No performance penalty for local access
  • Enables remote monitoring/management

Use when:

  • Main application needs fast local access
  • Also need remote admin/monitoring
  • Want flexibility without choosing
  • Development/production parity

How Advanced Features Work

Architecture:

cursor.execute('''
SELECT id, title,
vector_distance(embedding, ?, 'cosine') as similarity
FROM articles
ORDER BY similarity
LIMIT 10
''', (query_embedding,))

What happens:

  1. Vector column stored as special data type (VECTOR(n))
  2. HNSW (Hierarchical Navigable Small World) index created automatically
  3. Query planner recognizes vector_distance function
  4. Uses specialized vector search algorithm (not full table scan)
  5. Results ranked by similarity

Storage:

Regular Table Storage:
┌──────┬────────┬──────────────────────────────────┐
│ id │ title │ content │
├──────┼────────┼──────────────────────────────────┤
│ 1 │ "AI" │ "Machine learning introduction" │
└──────┴────────┴──────────────────────────────────┘
Vector Extension Storage:
┌──────┬──────────────────────────────────────────┐
│ id │ embedding (384-dim vector) │
├──────┼──────────────────────────────────────────┤
│ 1 │ [0.23, 0.45, 0.12, ..., 0.67] │
└──────┴──────────────────────────────────────────┘
HNSW Index (graph structure for fast search)

Performance:

  • Approximate nearest neighbor search: O(log n)
  • Much faster than brute force: O(n)
  • Trade-off: slight accuracy for massive speed

Time-Travel Queries

Architecture:

cursor.execute('''
SELECT * FROM users
AS OF TIMESTAMP '2024-01-01 10:00:00'
''')

How it works:

MVCC (Multi-Version Concurrency Control)
Current Data View:
┌──────┬─────────┬──────────────────────┐
│ id │ name │ version/timestamp │
├──────┼─────────┼──────────────────────┤
│ 1 │ "Alice" │ 2024-06-01 15:00 │
└──────┴─────────┴──────────────────────┘
Historical Versions (stored separately):
┌──────┬─────────┬──────────────────────┐
│ id │ name │ version/timestamp │
├──────┼─────────┼──────────────────────┤
│ 1 │ "A" │ 2024-01-01 09:00 │
│ 1 │ "Al" │ 2024-01-01 10:30 │
│ 1 │ "Alice" │ 2024-06-01 15:00 │ ← Current
└──────┴─────────┴──────────────────────┘
Query with AS OF TIMESTAMP:
- Looks up version <= specified timestamp
- Returns historical state
- No impact on current data

Storage overhead:

  • Only changed values stored (delta compression)
  • Configurable retention period
  • Automatic cleanup of old versions

Database Branching

Architecture:

branch = conn.create_branch('experiment')
branch_conn = sqlite3.connect('app.db', branch='experiment')

How it works:

Main Database:
┌────────────────────────────────────┐
│ Main branch (active) │
│ │
│ ┌──────────────────────────────┐ │
│ │ Table: users │ │
│ │ 1000 rows │ │
│ └──────────────────────────────┘ │
└────────────────────────────────────┘
│ create_branch('experiment')
┌────────────────────────────────────┐
│ Experiment branch (copy-on-write) │
│ │
│ ┌──────────────────────────────┐ │
│ │ Only stores CHANGES │ │
│ │ from main branch │ │
│ │ │ │
│ │ Modified rows: 50 │ │
│ │ Deleted rows: 10 │ │
│ │ New rows: 5 │ │
│ └──────────────────────────────┘ │
└────────────────────────────────────┘
Reading from branch:
1. Check if row modified in branch → use branch version
2. If not modified → use main branch version
3. Apply deletes from branch
4. Add new rows from branch
Result: Full database view without duplicating unchanged data

Benefits:

  • Instant branch creation (no copying)
  • Minimal storage overhead
  • Safe experimentation
  • Easy merge or discard

Data Storage Explanation

File Structure

myapp.db (HeliosDB Nano database file)
├─ Header (metadata)
│ ├─ Format version
│ ├─ Page size
│ ├─ Schema version
│ └─ Feature flags
├─ Schema section
│ ├─ Table definitions
│ ├─ Index definitions
│ ├─ View definitions
│ └─ Constraints
├─ Data pages (table data)
│ ├─ B-tree structure (like SQLite)
│ ├─ Row storage
│ └─ Index structures
├─ Advanced features section (optional)
│ ├─ Vector indexes (HNSW graphs)
│ ├─ Time-travel versions (MVCC)
│ ├─ Branch metadata
│ └─ Encryption metadata
└─ Transaction log (WAL)
├─ Uncommitted changes
└─ Recovery information

Compatibility:

  • Basic sections compatible with SQLite
  • Advanced sections ignored by SQLite (safe)
  • HeliosDB Nano reads both formats seamlessly

Performance Characteristics

Memory Usage

Per Connection:
- Base overhead: ~100 KB
- Query cache: ~10 MB (configurable)
- Connection buffers: ~1 MB
Per Table:
- Schema metadata: ~1-10 KB
- Index overhead: Varies (5-20% of table size)
Vector Search:
- HNSW index: ~20-40% of vector data size
- In-memory: Optional, speeds up queries 10-100x
Time-Travel:
- Historical versions: Configurable retention
- Delta compression reduces overhead to ~5-15%

Query Performance

Operation | SQLite | HeliosDB Nano
----------------------|-------------|---------------
Single row read | 10 μs | 12 μs (comparable)
Indexed search | 100 μs | 105 μs (comparable)
Full table scan | 10 ms | 11 ms (comparable)
Single write | 50 μs | 75 μs (slight overhead)
Concurrent reads | Excellent | Excellent (same)
Concurrent writes | Blocked | Parallel! (major win)
Vector search (10k) | N/A | 2 ms (exclusive)
Time-travel query | N/A | 15 ms (exclusive)

Key takeaway:

  • Similar performance for standard operations
  • Much better with concurrent writes
  • Advanced features add new capabilities (not slower alternatives)

Concurrency Model

How HeliosDB Nano Handles Concurrent Access

SQLite approach (locks):

Thread 1: BEGIN → Write → [Lock database] → COMMIT → [Unlock]
Thread 2: [WAIT...] → BEGIN → Write
Thread 3: [WAIT...] → BEGIN

HeliosDB Nano approach (MVCC):

Thread 1: BEGIN → Write to version 1 → COMMIT
Thread 2: BEGIN → Write to version 2 → COMMIT (simultaneous!)
Thread 3: BEGIN → Read version 1 → done (doesn't block writes)
Conflict resolution:
- Non-conflicting writes: Both succeed
- Conflicting writes: Last commit wins (configurable)
- Readers never block writers
- Writers never block readers

Implementation:

Transaction Log:
Time ─────────────────────────────▶
T1: [BEGIN] ────── [Write row A] ────── [COMMIT v1]
T2: [BEGIN] ────── [Write row B] ────── [COMMIT v2]
T3: [BEGIN] [Read] [COMMIT]
All transactions see consistent snapshot
No locks needed!

What Happens When You Import

Detailed Import Process

import heliosdb_sqlite as sqlite3

Step-by-step:

  1. Python loads .so/.dll (compiled Rust library)

    HeliosDB Nano module ────▶ [OS loads shared library]
  2. Module initialization (happens once)

    // Rust code (simplified)
    #[pymodule]
    fn heliosdb_sqlite(py: Python, m: &PyModule) -> PyResult<()> {
    m.add_class::<Connection>()?;
    m.add_class::<Cursor>()?;
    m.add_function(wrap_pyfunction!(connect, m)?)?;
    // ... register all DB-API 2.0 functions
    Ok(())
    }
  3. Python receives module with all methods

    # Now available:
    sqlite3.connect(...)
    sqlite3.Connection
    sqlite3.Cursor
    sqlite3.Row
    # etc. - full DB-API 2.0 interface
  4. No configuration needed - ready to use!


Threading and Async Support

Thread Safety

HeliosDB Nano is thread-safe by default (unlike SQLite):

# This works safely in HeliosDB Nano
conn = sqlite3.connect('app.db', check_same_thread=False)
def worker(thread_id):
cursor = conn.cursor() # Safe from multiple threads
cursor.execute('INSERT INTO logs VALUES (?, ?)', (thread_id, 'message'))
conn.commit()
threads = [Thread(target=worker, args=(i,)) for i in range(100)]
# All threads can use same connection safely!

How it’s safe:

  • Rust’s ownership model prevents data races
  • Internal synchronization (lock-free where possible)
  • MVCC eliminates most contention

Async/Await Support

import asyncio
import heliosdb_sqlite as sqlite3
async def async_query():
loop = asyncio.get_event_loop()
# Run blocking operation in thread pool
def blocking_query():
conn = sqlite3.connect('app.db')
cursor = conn.cursor()
cursor.execute('SELECT * FROM users')
return cursor.fetchall()
results = await loop.run_in_executor(None, blocking_query)
return results
# Or use async connection pool
from heliosdb_sqlite.asyncio import AsyncConnectionPool
pool = AsyncConnectionPool('app.db')
async with pool.connection() as conn:
cursor = conn.cursor()
results = await cursor.execute('SELECT * FROM users')

Summary: Why This Architecture Matters

For Drop-In Compatibility:

  • Implements exact same Python interface as SQLite
  • You change one import line, everything works
  • Existing code, queries, databases all compatible

For Performance:

  • Rust core = fast and safe
  • MVCC = no write locks
  • Parallel execution where possible
  • Optimized for modern workloads

For Advanced Features:

  • Vector search: Built-in HNSW indexes
  • Time-travel: MVCC enables historical queries
  • Branching: Copy-on-write for safe experimentation
  • Encryption: Transparent at storage layer

For Flexibility:

  • Three modes: embedded, server, hybrid
  • Start simple, scale when needed
  • No architectural changes required

What’s Next?

Now that you understand how HeliosDB Nano works:

  1. Try it out: HELIOSDB_SQLITE_DROP_IN_GUIDE.md
  2. Migrate your app: HELIOSDB_SQLITE_MIGRATION_PATTERNS.md
  3. Explore advanced features: HELIOSDB_SQLITE_ADVANCED_FEATURES.md
  4. Troubleshoot issues: HELIOSDB_SQLITE_TROUBLESHOOTING.md
  5. Read FAQ: HELIOSDB_SQLITE_FAQ.md

The architecture is designed to be invisible when you don’t need it, and powerful when you do.

Happy building with HeliosDB Nano!