HeliosDB Nano SQLite Advanced Features Guide
Once you’re comfortable with basic drop-in SQLite compatibility, HeliosDB Nano offers powerful advanced features that go far beyond traditional SQLite capabilities.
Table of Contents
- Vector Search from Python
- Database Branching
- Time-Travel Queries
- Encryption at Rest
- Server Mode Switching
- Hybrid Mode Usage
- Performance Tuning
- Gradual Migration to Full Features
Vector Search from Python
HeliosDB Nano includes native vector search capabilities for semantic similarity, embeddings, and AI applications.
Basic Vector Search Setup
```python
import heliosdb_sqlite as sqlite3
import numpy as np

# Connect with vector search enabled
conn = sqlite3.connect('vectors.db', enable_vector_search=True)
cursor = conn.cursor()

# Create table with vector column
cursor.execute('''
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        title TEXT,
        content TEXT,
        embedding VECTOR(384)  -- 384-dimensional vector
    )
''')

# Insert document with embedding
doc_embedding = np.random.rand(384).tolist()
cursor.execute('''
    INSERT INTO documents (title, content, embedding)
    VALUES (?, ?, ?)
''', ('Introduction to AI', 'Machine learning basics...', doc_embedding))
conn.commit()
```

Semantic Search Queries
```python
# Generate query embedding
query_text = "artificial intelligence fundamentals"
query_embedding = generate_embedding(query_text)  # Your embedding model
```
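`generate_embedding` above stands in for whatever embedding model you use. If you want to try these examples without installing a model, a toy hashing-based embedder (purely illustrative, not semantically meaningful) keeps them runnable:

```python
import hashlib

def generate_embedding(text, dim=384):
    """Toy stand-in for a real embedding model: deterministic,
    hashing-based, and NOT semantically meaningful."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Hash each token into one of `dim` buckets
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    # L2-normalize so cosine comparisons behave sensibly
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

emb = generate_embedding("artificial intelligence fundamentals")
print(len(emb))  # 384
```

Swap in a real model (such as the sentence-transformers example later in this guide) for meaningful similarity scores.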
```python
# Find similar documents using cosine similarity
cursor.execute('''
    SELECT id, title, content,
           vector_distance(embedding, ?, 'cosine') AS similarity
    FROM documents
    ORDER BY similarity ASC
    LIMIT 10
''', (query_embedding,))

results = cursor.fetchall()
for doc_id, title, content, similarity in results:
    print(f"{title}: {similarity:.4f}")
```

Advanced Vector Operations
```python
# Different distance metrics
cursor.execute('''
    SELECT title,
           vector_distance(embedding, ?, 'cosine') AS cosine_dist,
           vector_distance(embedding, ?, 'euclidean') AS euclidean_dist,
           vector_distance(embedding, ?, 'manhattan') AS manhattan_dist
    FROM documents
''', (query_embedding, query_embedding, query_embedding))
```
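For intuition about what the cosine metric returns: cosine distance is conventionally 1 minus cosine similarity, so 0 means identical direction and 2 means opposite (whether HeliosDB uses exactly this convention is worth confirming against its reference). In plain Python:

```python
def cosine_distance(a, b):
    # 1 - (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return 1.0 - dot / (na * nb)

print(cosine_distance([1, 0], [1, 0]))   # 0.0 (identical direction)
print(cosine_distance([1, 0], [0, 1]))   # 1.0 (orthogonal)
print(cosine_distance([1, 0], [-1, 0]))  # 2.0 (opposite)
```

This is why the queries above sort by similarity ascending: smaller distance means more similar.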
```python
# Vector arithmetic (e.g., "king - man + woman = queen")
cursor.execute('''
    SELECT title,
           vector_distance(
               embedding,
               vector_add(vector_subtract(?, ?), ?),
               'cosine'
           ) AS similarity
    FROM documents
    ORDER BY similarity ASC
    LIMIT 5
''', (king_embedding, man_embedding, woman_embedding))

# Batch vector search
# (executemany does not return rows, so run each query with execute)
query_embeddings = [emb1, emb2, emb3]  # precomputed query embeddings
matches = []
for emb in query_embeddings:
    cursor.execute('''
        SELECT id, title FROM documents
        WHERE vector_distance(embedding, ?, 'cosine') < 0.3
    ''', (emb,))
    matches.append(cursor.fetchall())
```

Real-World Example: Semantic Document Search
```python
import heliosdb_sqlite as sqlite3
from sentence_transformers import SentenceTransformer

# Initialize embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Connect to database
conn = sqlite3.connect('knowledge_base.db', enable_vector_search=True)
cursor = conn.cursor()

# Create schema
cursor.execute('''
    CREATE TABLE IF NOT EXISTS articles (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT NOT NULL,
        content TEXT NOT NULL,
        category TEXT,
        embedding VECTOR(384),
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
''')

def add_article(title, content, category):
    """Add article with automatic embedding generation"""
    embedding = model.encode(content).tolist()
    cursor.execute('''
        INSERT INTO articles (title, content, category, embedding)
        VALUES (?, ?, ?, ?)
    ''', (title, content, category, embedding))
    conn.commit()
    return cursor.lastrowid

def semantic_search(query, limit=10, category=None):
    """Semantic search with optional category filter"""
    query_embedding = model.encode(query).tolist()

    if category:
        cursor.execute('''
            SELECT id, title, content, category,
                   vector_distance(embedding, ?, 'cosine') AS similarity
            FROM articles
            WHERE category = ?
            ORDER BY similarity ASC
            LIMIT ?
        ''', (query_embedding, category, limit))
    else:
        cursor.execute('''
            SELECT id, title, content, category,
                   vector_distance(embedding, ?, 'cosine') AS similarity
            FROM articles
            ORDER BY similarity ASC
            LIMIT ?
        ''', (query_embedding, limit))

    return cursor.fetchall()

# Usage
add_article(
    'Introduction to Neural Networks',
    'Neural networks are computing systems inspired by biological neural networks...',
    'AI'
)

results = semantic_search('deep learning basics', limit=5)
for article_id, title, content, category, similarity in results:
    print(f"[{category}] {title} (similarity: {similarity:.4f})")
```

Database Branching
Create experimental branches of your database for testing without affecting production data.
Creating and Using Branches
```python
import heliosdb_sqlite as sqlite3

conn = sqlite3.connect('production.db')

# Create a branch for experimentation
branch = conn.create_branch('experiment_2024')
print(f"Created branch: {branch.name}")

# Work on the branch
branch_conn = sqlite3.connect('production.db', branch='experiment_2024')
cursor = branch_conn.cursor()

# Make changes on the branch (doesn't affect main database)
cursor.execute('DELETE FROM users WHERE inactive = 1')
cursor.execute('UPDATE products SET price = price * 1.1')
branch_conn.commit()

# Test the changes
cursor.execute('SELECT COUNT(*) FROM users')
print(f"Users after cleanup: {cursor.fetchone()[0]}")

# If satisfied, merge back to main
if results_look_good():  # your own validation logic
    conn.merge_branch('experiment_2024')
    print("Branch merged successfully!")
else:
    conn.delete_branch('experiment_2024')
    print("Branch discarded")

branch_conn.close()
conn.close()
```

Use Cases for Branching
```python
import time
import heliosdb_sqlite as sqlite3

# Use Case 1: A/B Testing
def ab_test_query_performance():
    conn = sqlite3.connect('analytics.db')

    # Branch A: with new index
    conn.create_branch('with_index')
    conn_a = sqlite3.connect('analytics.db', branch='with_index')
    conn_a.execute('CREATE INDEX idx_user_date ON events(user_id, date)')

    # Branch B: without index
    conn.create_branch('without_index')
    conn_b = sqlite3.connect('analytics.db', branch='without_index')

    # Time the same query on both branches
    start = time.time()
    conn_a.execute('SELECT * FROM events WHERE user_id = ? AND date > ?',
                   (123, '2024-01-01'))
    time_a = time.time() - start

    start = time.time()
    conn_b.execute('SELECT * FROM events WHERE user_id = ? AND date > ?',
                   (123, '2024-01-01'))
    time_b = time.time() - start

    print(f"With index: {time_a:.4f}s")
    print(f"Without index: {time_b:.4f}s")

    # Merge the faster version
    if time_a < time_b:
        conn.merge_branch('with_index')
        print("Merged optimized version")

    conn_a.close()
    conn_b.close()
    conn.close()
```
```python
# Use Case 2: Safe Schema Migrations
def safe_migration():
    conn = sqlite3.connect('app.db')

    # Create migration branch
    conn.create_branch('schema_v2')
    conn_migration = sqlite3.connect('app.db', branch='schema_v2')

    try:
        # Attempt migration
        conn_migration.execute('ALTER TABLE users ADD COLUMN preferences JSON')
        conn_migration.execute("UPDATE users SET preferences = '{}'")
        conn_migration.commit()

        # Run tests on the migrated schema
        run_integration_tests(conn_migration)  # your own test suite

        # If tests pass, merge
        conn.merge_branch('schema_v2')
        print("Migration successful!")

    except Exception as e:
        print(f"Migration failed: {e}")
        conn.delete_branch('schema_v2')
        print("Rolled back to previous schema")

    conn_migration.close()
    conn.close()
```

Listing and Managing Branches
```python
conn = sqlite3.connect('app.db')

# List all branches
branches = conn.list_branches()
for branch in branches:
    print(f"Branch: {branch.name}, Created: {branch.created_at}, "
          f"Size: {branch.size_mb}MB")

# Get branch information
branch_info = conn.get_branch_info('experiment_2024')
print(f"Commits: {branch_info.commits}")
print(f"Diverged from main: {branch_info.divergence_point}")

# Compare branches
diff = conn.compare_branches('main', 'experiment_2024')
print(f"Tables changed: {diff.tables_modified}")
print(f"Rows changed: {diff.total_row_changes}")

# Clean up old branches
for branch in branches:
    if branch.age_days > 30 and not branch.is_active:
        conn.delete_branch(branch.name)
        print(f"Deleted old branch: {branch.name}")
```

Time-Travel Queries
Query historical data states without maintaining separate backup databases.
Basic Time-Travel Queries
```python
import heliosdb_sqlite as sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect('app.db', enable_time_travel=True)
cursor = conn.cursor()

# Query current state
cursor.execute('SELECT * FROM users WHERE id = 123')
current_user = cursor.fetchone()
print(f"Current: {current_user}")

# Query state from 1 hour ago (AS OF follows the table name)
one_hour_ago = datetime.now() - timedelta(hours=1)
cursor.execute(f'''
    SELECT * FROM users AS OF TIMESTAMP '{one_hour_ago.isoformat()}'
    WHERE id = 123
''')
past_user = cursor.fetchone()
print(f"1 hour ago: {past_user}")

# Query state from a specific date
cursor.execute('''
    SELECT * FROM products AS OF TIMESTAMP '2024-01-01 00:00:00'
''')
products_jan1 = cursor.fetchall()
print(f"Products on Jan 1: {len(products_jan1)}")
```

Tracking Changes Over Time
```python
# Find every version of a record in a time window
# (FOR SYSTEM_TIME also follows the table name)
cursor.execute('''
    SELECT system_time, id, name, email FROM users
    FOR SYSTEM_TIME BETWEEN TIMESTAMP '2024-01-01 00:00:00'
                        AND TIMESTAMP '2024-01-31 23:59:59'
    WHERE id = 123
''')

history = cursor.fetchall()
print(f"User had {len(history)} changes in January")

for timestamp, user_id, name, email in history:
    print(f"{timestamp}: {name}, {email}")
```

Audit and Compliance
```python
def audit_user_changes(user_id, start_date, end_date):
    """Generate audit report of all changes to a user"""
    conn = sqlite3.connect('secure.db', enable_time_travel=True)
    cursor = conn.cursor()

    cursor.execute('''
        SELECT system_time, id, name, email, role, modified_by
        FROM users
        FOR SYSTEM_TIME BETWEEN ? AND ?
        WHERE id = ?
        ORDER BY system_time
    ''', (start_date, end_date, user_id))

    changes = cursor.fetchall()

    print(f"Audit Report for User {user_id}")
    print(f"Period: {start_date} to {end_date}")
    print(f"Total changes: {len(changes)}\n")

    for i, (timestamp, uid, name, email, role, modified_by) in enumerate(changes):
        if i > 0:
            print(f"\n--- Change {i} at {timestamp} by {modified_by} ---")
            prev = changes[i - 1]
            if name != prev[2]:
                print(f"Name: {prev[2]} → {name}")
            if email != prev[3]:
                print(f"Email: {prev[3]} → {email}")
            if role != prev[4]:
                print(f"Role: {prev[4]} → {role}")
        else:
            print(f"Initial state at {timestamp}: {name}, {email}, {role}")

    conn.close()
```
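If you have not enabled time-travel yet, a trigger-maintained history table approximates this audit trail in plain SQLite. A self-contained sketch using the stdlib driver:

```python
import sqlite3  # stdlib driver; the pattern is identical under heliosdb_sqlite

conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE users_history (
        id INTEGER, name TEXT, email TEXT,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- Copy the old row into history on every update
    CREATE TRIGGER users_audit AFTER UPDATE ON users
    BEGIN
        INSERT INTO users_history (id, name, email)
        VALUES (OLD.id, OLD.name, OLD.email);
    END;
''')
conn.execute("INSERT INTO users VALUES (1, 'alice', 'a@old.example')")
conn.execute("UPDATE users SET email = 'a@new.example' WHERE id = 1")
history = conn.execute('SELECT name, email FROM users_history').fetchall()
print(history)  # [('alice', 'a@old.example')]
```

The trade-off is that you must write one trigger per audited table, and deletions need a second trigger; built-in time-travel handles all of this transparently.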
```python
# Usage
audit_user_changes(123, '2024-01-01', '2024-12-31')
```

Rollback to Previous State
```python
def rollback_table_to_timestamp(table_name, target_timestamp):
    """Restore table to a previous state.

    table_name must come from trusted code: identifiers cannot be
    bound as parameters, so it is interpolated into the SQL directly.
    """
    conn = sqlite3.connect('app.db', enable_time_travel=True)
    cursor = conn.cursor()

    # Get historical data
    cursor.execute(f'''
        SELECT * FROM {table_name}
        AS OF TIMESTAMP '{target_timestamp}'
    ''')
    historical_data = cursor.fetchall()

    # Get column names
    cursor.execute(f'PRAGMA table_info({table_name})')
    columns = [col[1] for col in cursor.fetchall()]

    # Clear current data
    cursor.execute(f'DELETE FROM {table_name}')

    # Restore historical data
    placeholders = ','.join(['?' for _ in columns])
    cursor.executemany(
        f'INSERT INTO {table_name} VALUES ({placeholders})',
        historical_data
    )

    conn.commit()
    print(f"Restored {table_name} to state at {target_timestamp}")
    print(f"Restored {len(historical_data)} rows")

    conn.close()
```
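Without time-travel, the closest plain-SQLite equivalent of this rollback is an explicit snapshot table taken before the risky operation. A runnable sketch with the stdlib driver:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)')
conn.executemany('INSERT INTO users VALUES (?, ?)', [(1, 'alice'), (2, 'bob')])

# Take a snapshot before a risky operation
conn.execute('CREATE TABLE users_snapshot AS SELECT * FROM users')

# Oops: accidental deletion
conn.execute('DELETE FROM users')

# Restore from the snapshot
conn.execute('INSERT INTO users SELECT * FROM users_snapshot')
restored = conn.execute('SELECT COUNT(*) FROM users').fetchone()[0]
print(restored)  # 2
```

Unlike time-travel, this only protects the moments you remembered to snapshot.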
```python
# Usage: rollback accidental deletion
rollback_table_to_timestamp('users', '2024-06-01 10:00:00')
```

Encryption at Rest
Protect sensitive data with transparent database encryption.
Enabling Encryption
```python
import heliosdb_sqlite as sqlite3
import os

# Generate or retrieve encryption key (keep this secure!)
encryption_key = os.getenv('DB_ENCRYPTION_KEY')

# Create encrypted database
conn = sqlite3.connect('secure.db', encryption_key=encryption_key)
cursor = conn.cursor()

# Use normally - encryption is transparent
cursor.execute('CREATE TABLE secrets (id INTEGER, data TEXT)')
cursor.execute("INSERT INTO secrets VALUES (1, 'sensitive information')")
conn.commit()

# Data is encrypted on disk
conn.close()

# To read, you must provide the same key
conn = sqlite3.connect('secure.db', encryption_key=encryption_key)
cursor = conn.cursor()
cursor.execute('SELECT * FROM secrets')
print(cursor.fetchall())  # Decrypted automatically
```

Key Rotation
```python
import os
import shutil
import tempfile

def rotate_encryption_key(db_path, old_key, new_key):
    """Safely rotate database encryption key"""
    # Open with old key
    conn_old = sqlite3.connect(db_path, encryption_key=old_key)

    # Create temporary database with new key
    temp_db = tempfile.NamedTemporaryFile(delete=False, suffix='.db')
    temp_db.close()  # we only need the path; the driver opens it itself
    conn_new = sqlite3.connect(temp_db.name, encryption_key=new_key)

    # Copy all data
    for line in conn_old.iterdump():
        conn_new.execute(line)
    conn_new.commit()

    conn_old.close()
    conn_new.close()

    # Replace old database
    shutil.move(temp_db.name, db_path)
    print("Encryption key rotated successfully")

# Usage
old_key = os.getenv('OLD_DB_KEY')
new_key = os.getenv('NEW_DB_KEY')
rotate_encryption_key('secure.db', old_key, new_key)
```

Encrypted Columns
```python
# Encrypt specific columns while keeping others searchable
conn = sqlite3.connect('mixed.db', encryption_key=encryption_key)
cursor = conn.cursor()

cursor.execute('''
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT,                 -- Not encrypted, searchable
        email TEXT ENCRYPTED,          -- Encrypted column
        password_hash TEXT ENCRYPTED
    )
''')

# Insert with automatic column encryption
cursor.execute('''
    INSERT INTO users (username, email, password_hash)
    VALUES (?, ?, ?)
''', ('alice', 'alice@example.com', 'hashed_password'))

# Search on non-encrypted columns (fast)
cursor.execute('SELECT * FROM users WHERE username = ?', ('alice',))

# Encrypted columns are decrypted automatically on read
user = cursor.fetchone()
print(f"Email: {user[2]}")  # Decrypted
```

Server Mode Switching
Run HeliosDB Nano as a PostgreSQL-compatible server for multi-client access.
Starting Server Mode
```python
import heliosdb_sqlite as sqlite3

# Start database in server mode
server = sqlite3.start_server(
    database='app.db',
    host='0.0.0.0',
    port=5432,
    max_connections=100
)

print(f"HeliosDB Nano server listening on port {server.port}")
print("Connect using: postgresql://localhost:5432/app")

# The server runs in the background while the application continues;
# clients can connect using standard PostgreSQL drivers.

# Stop server when done
# server.stop()
```

Connecting as Client
```python
# From another Python script or machine
import psycopg2  # Standard PostgreSQL driver

# Connect to HeliosDB Nano server
conn = psycopg2.connect(
    host='localhost',
    port=5432,
    database='app',
    user='helios',
    password=''
)

cursor = conn.cursor()
cursor.execute('SELECT * FROM users')
users = cursor.fetchall()
print(users)

conn.close()
```

Server Configuration
```python
# Advanced server configuration
server = sqlite3.start_server(
    database='app.db',
    host='0.0.0.0',
    port=5432,
    max_connections=200,
    connection_timeout=30,
    enable_ssl=True,
    ssl_cert='server.crt',
    ssl_key='server.key',
    auth_method='password',  # or 'certificate'
    allowed_ips=['192.168.1.0/24', '10.0.0.0/8']
)

# Monitor server status
status = server.get_status()
print(f"Active connections: {status.active_connections}")
print(f"Total queries: {status.total_queries}")
print(f"Uptime: {status.uptime_seconds}s")
```

Hybrid Mode Usage
Combine embedded and server modes for flexible architecture.
Hybrid Mode Setup
```python
import heliosdb_sqlite as sqlite3

# Start in hybrid mode
conn = sqlite3.connect('app.db', heliosdb_mode='hybrid')

# Use as embedded database (fast, local)
cursor = conn.cursor()
cursor.execute('SELECT COUNT(*) FROM users')
local_count = cursor.fetchone()[0]
print(f"Local access: {local_count} users")

# Also accessible via network (for other clients);
# a server is automatically started in the background
server_info = conn.get_server_info()
print(f"Also available at: {server_info.url}")

# Other applications can connect remotely
# while this application uses local access
```

Use Cases for Hybrid Mode
```python
# Use Case 1: Web application with admin tools
# The web app uses local embedded access for speed
import heliosdb_sqlite as sqlite3
from flask import Flask, jsonify

app = Flask(__name__)
db_conn = sqlite3.connect('app.db', heliosdb_mode='hybrid')

@app.route('/api/users')
def get_users():
    # Fast local access
    cursor = db_conn.cursor()
    cursor.execute('SELECT * FROM users')
    return jsonify(cursor.fetchall())

# Meanwhile, an admin can connect remotely for management:
# $ psql -h localhost -p 5432 -U helios app
```
```python
# Use Case 2: Local ML training with remote monitoring
import heliosdb_sqlite as sqlite3

# The ML training script uses local access
conn = sqlite3.connect('training.db', heliosdb_mode='hybrid')

def train_model():
    cursor = conn.cursor()
    for epoch in range(100):
        loss, accuracy = run_training_step()  # your training loop
        # Fast local writes
        cursor.execute('INSERT INTO training_log VALUES (?, ?, ?)',
                       (epoch, loss, accuracy))
        conn.commit()

# A monitoring dashboard connects remotely
# to visualize training progress in real time
```

Performance Tuning for HeliosDB Nano
Optimize HeliosDB Nano for your specific workload.
Connection Pooling
```python
from heliosdb_sqlite import connection_pool

# Create connection pool
pool = connection_pool.create(
    database='app.db',
    pool_size=20,
    max_overflow=10,
    timeout=30,
    recycle=3600  # Recycle connections after 1 hour
)

# Use connections from the pool
with pool.connection() as conn:
    cursor = conn.cursor()
    cursor.execute('SELECT * FROM users')
    users = cursor.fetchall()
```
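If you want to understand what a pool does, or need a fallback with the stdlib driver, a minimal queue-based pool can be sketched like this (hypothetical; HeliosDB's real pool adds overflow, recycling, and health checks):

```python
import sqlite3
import queue
from contextlib import contextmanager

class SimplePool:
    """Minimal fixed-size connection pool (illustrative sketch)."""

    def __init__(self, database, pool_size=5):
        self._free = queue.Queue()
        for _ in range(pool_size):
            self._free.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout=30):
        conn = self._free.get(timeout=timeout)  # block until one is free
        try:
            yield conn
        finally:
            self._free.put(conn)  # hand the connection back

pool = SimplePool(':memory:', pool_size=2)
with pool.connection() as conn:
    print(conn.execute('SELECT 1').fetchone())  # (1,)
```

Note that with `:memory:` each pooled connection holds its own private database; use a file path when the connections must share data.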
The connection is automatically returned to the pool when the `with` block exits.

Query Optimization
```python
conn = sqlite3.connect('app.db')

# Enable query plan analysis
conn.enable_query_analyzer()

# Run your queries
cursor = conn.cursor()
cursor.execute('SELECT * FROM users WHERE created_at > ?', ('2024-01-01',))

# Get optimization suggestions
suggestions = conn.get_query_suggestions()
for suggestion in suggestions:
    print(f"Query: {suggestion.query}")
    print(f"Current plan: {suggestion.current_plan}")
    print(f"Suggested index: {suggestion.suggested_index}")
    print(f"Expected speedup: {suggestion.speedup_factor}x")
```

Batch Operations
```python
# Efficient batch inserts
conn = sqlite3.connect('app.db')
cursor = conn.cursor()

users = [
    ('user1', 'email1@example.com'),
    ('user2', 'email2@example.com'),
    # ... thousands of users
]

# Traditional way (slower)
# for user in users:
#     cursor.execute('INSERT INTO users VALUES (?, ?)', user)
```
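A portable middle ground is `executemany`, which batches parameter binding in stock `sqlite3` and should work unchanged here. A self-contained example:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE users (name TEXT, email TEXT)')

users = [('user1', 'email1@example.com'), ('user2', 'email2@example.com')]
# One statement, many parameter tuples
cursor.executemany('INSERT INTO users VALUES (?, ?)', users)
conn.commit()

count = cursor.execute('SELECT COUNT(*) FROM users').fetchone()[0]
print(count)  # 2
```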
```python
# Optimized way (faster)
cursor.batch_insert('users', ['name', 'email'], users)
conn.commit()

print(f"Inserted {len(users)} users efficiently")
```

Caching Strategies
```python
# Enable query result caching
conn = sqlite3.connect('app.db', enable_cache=True, cache_size_mb=100)

# First query: hits the database
cursor = conn.cursor()
cursor.execute('SELECT * FROM products WHERE category = ?', ('electronics',))
products1 = cursor.fetchall()  # Slow (first time)

# Same query: served from cache
cursor.execute('SELECT * FROM products WHERE category = ?', ('electronics',))
products2 = cursor.fetchall()  # Fast (cached)

# Cache statistics
stats = conn.get_cache_stats()
print(f"Cache hit rate: {stats.hit_rate:.2%}")
print(f"Queries served from cache: {stats.cache_hits}")
```

Gradual Migration to Full HeliosDB Nano Features
Incrementally adopt advanced features as your application grows.
Phase 1: Drop-In Replacement
```python
# Start simple: just replace the import
import heliosdb_sqlite as sqlite3

# Everything else unchanged
conn = sqlite3.connect('app.db')
# ... existing code ...
```

Phase 2: Enable Better Concurrency
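To see what "better concurrency" means in practice, a small multi-writer stress test is useful. This sketch uses the stdlib driver, where heavy contention needs a generous timeout to avoid "database is locked" errors; the idea is that under HeliosDB Nano the same code runs without that tuning:

```python
import os
import sqlite3
import tempfile
import threading

db_path = os.path.join(tempfile.mkdtemp(), 'stress.db')

init = sqlite3.connect(db_path)
init.execute('CREATE TABLE log (thread INTEGER, i INTEGER)')
init.commit()
init.close()

def writer(tid, rows=50):
    # Each thread uses its own connection; the generous timeout keeps
    # stock SQLite from raising "database is locked" under contention
    conn = sqlite3.connect(db_path, timeout=30)
    for i in range(rows):
        conn.execute('INSERT INTO log VALUES (?, ?)', (tid, i))
        conn.commit()
    conn.close()

threads = [threading.Thread(target=writer, args=(t,)) for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sqlite3.connect(db_path).execute('SELECT COUNT(*) FROM log').fetchone()[0]
print(total)  # 200
```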
No code changes needed: HeliosDB Nano automatically handles concurrent writes better, so your existing multi-threaded and async code simply works better.

Phase 3: Add Vector Search
```python
# Add vector search to one feature
conn = sqlite3.connect('app.db', enable_vector_search=True)

# Add a vector column to an existing table
cursor = conn.cursor()
cursor.execute('ALTER TABLE articles ADD COLUMN embedding VECTOR(384)')

# Populate embeddings for existing data
cursor.execute('SELECT id, content FROM articles')
for article_id, content in cursor.fetchall():
    embedding = generate_embedding(content)  # your embedding model
    cursor.execute('UPDATE articles SET embedding = ? WHERE id = ?',
                   (embedding, article_id))
conn.commit()

# Now you have semantic search!
```

Phase 4: Enable Time-Travel for Audit
```python
# Enable time-travel on sensitive tables
conn = sqlite3.connect('app.db', enable_time_travel=True)

# Add system versioning to audit-critical tables
cursor = conn.cursor()
cursor.execute('''
    ALTER TABLE financial_transactions
    ADD SYSTEM VERSIONING
''')

# Now you have an automatic audit trail
```

Phase 5: Create Development Branches
```python
# Use branching for safe testing
conn = sqlite3.connect('app.db')

# Before a risky operation, create a branch
conn.create_branch('test_migration')
test_conn = sqlite3.connect('app.db', branch='test_migration')

# Test your changes
test_conn.execute('ALTER TABLE users ADD COLUMN new_field TEXT')

# If successful, merge; if not, discard
if tests_pass():  # your own test suite
    conn.merge_branch('test_migration')
else:
    conn.delete_branch('test_migration')
```

Phase 6: Scale with Server Mode
```python
# When you outgrow a single-machine setup
server = sqlite3.start_server(
    database='app.db',
    host='0.0.0.0',
    port=5432,
    max_connections=200
)

# The application can still use local access for speed,
# but now supports remote clients too
```

Summary
HeliosDB Nano advanced features are opt-in and gradual:
- Start with drop-in SQLite compatibility
- Add features as needed:
  - Vector search for AI/ML applications
  - Time-travel for audit and compliance
  - Branching for safe experimentation
  - Encryption for security
  - Server mode for scale
  - Hybrid mode for flexibility
- No need to adopt everything at once
- Each feature enhances, doesn't replace, basic functionality
- All features work together seamlessly
Remember: You can stay on basic SQLite compatibility forever if that meets your needs. Advanced features are there when you’re ready.
For more information:
- Drop-in guide: HELIOSDB_SQLITE_DROP_IN_GUIDE.md
- Migration patterns: HELIOSDB_SQLITE_MIGRATION_PATTERNS.md
- FAQ: HELIOSDB_SQLITE_FAQ.md