HeliosDB Nano SQLite Advanced Features Guide
Once you’re comfortable with basic drop-in SQLite compatibility, HeliosDB Nano offers powerful advanced features that go far beyond traditional SQLite capabilities.
Table of Contents
- Vector Search from Python
- Database Branching
- Time-Travel Queries
- Encryption at Rest
- Server Mode Switching
- Hybrid Mode Usage
- Performance Tuning
- Gradual Migration to Full Features
Vector Search from Python
HeliosDB Nano includes native vector search capabilities for semantic similarity, embeddings, and AI applications.
Basic Vector Search Setup
```python
import heliosdb_sqlite as sqlite3
import numpy as np

# Connect with vector search enabled
conn = sqlite3.connect('vectors.db', enable_vector_search=True)
cursor = conn.cursor()

# Create table with vector column
cursor.execute('''
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        title TEXT,
        content TEXT,
        embedding VECTOR(384)  -- 384-dimensional vector
    )
''')

# Insert document with embedding
doc_embedding = np.random.rand(384).tolist()
cursor.execute('''
    INSERT INTO documents (title, content, embedding)
    VALUES (?, ?, ?)
''', ('Introduction to AI', 'Machine learning basics...', doc_embedding))
conn.commit()
```

Semantic Search Queries
```python
# Generate query embedding
query_text = "artificial intelligence fundamentals"
query_embedding = generate_embedding(query_text)  # Your embedding model
```
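`generate_embedding` above stands in for whatever embedding model you use. If you want to try these examples without installing a model, a toy hashing-based embedder (purely illustrative, not semantically meaningful) keeps them runnable:

```python
import hashlib

def generate_embedding(text, dim=384):
    """Toy stand-in for a real embedding model: deterministic,
    hashing-based, and NOT semantically meaningful."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Hash each token into one of `dim` buckets
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    # L2-normalize so cosine comparisons behave sensibly
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

emb = generate_embedding("artificial intelligence fundamentals")
print(len(emb))  # 384
```

Swap in a real model (such as the sentence-transformers example later in this guide) for meaningful similarity scores.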
```python
# Find similar documents using cosine similarity
cursor.execute('''
    SELECT id, title, content,
           vector_distance(embedding, ?, 'cosine') AS similarity
    FROM documents
    ORDER BY similarity ASC
    LIMIT 10
''', (query_embedding,))

results = cursor.fetchall()
for doc_id, title, content, similarity in results:
    print(f"{title}: {similarity:.4f}")
```

Advanced Vector Operations
```python
# Different distance metrics
cursor.execute('''
    SELECT title,
           vector_distance(embedding, ?, 'cosine') AS cosine_dist,
           vector_distance(embedding, ?, 'euclidean') AS euclidean_dist,
           vector_distance(embedding, ?, 'manhattan') AS manhattan_dist
    FROM documents
''', (query_embedding, query_embedding, query_embedding))
```
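For intuition about what the cosine metric returns: cosine distance is conventionally 1 minus cosine similarity, so 0 means identical direction and 2 means opposite (whether HeliosDB uses exactly this convention is worth confirming against its reference). In plain Python:

```python
def cosine_distance(a, b):
    # 1 - (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return 1.0 - dot / (na * nb)

print(cosine_distance([1, 0], [1, 0]))   # 0.0 (identical direction)
print(cosine_distance([1, 0], [0, 1]))   # 1.0 (orthogonal)
print(cosine_distance([1, 0], [-1, 0]))  # 2.0 (opposite)
```

This is why the queries above sort by similarity ascending: smaller distance means more similar.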
```python
# Vector arithmetic (e.g., "king - man + woman = queen")
cursor.execute('''
    SELECT title,
           vector_distance(
               embedding,
               vector_add(vector_subtract(?, ?), ?),
               'cosine'
           ) AS similarity
    FROM documents
    ORDER BY similarity ASC
    LIMIT 5
''', (king_embedding, man_embedding, woman_embedding))

# Batch vector search
# (executemany does not return rows, so run each query with execute)
query_embeddings = [emb1, emb2, emb3]  # precomputed query embeddings
matches = []
for emb in query_embeddings:
    cursor.execute('''
        SELECT id, title FROM documents
        WHERE vector_distance(embedding, ?, 'cosine') < 0.3
    ''', (emb,))
    matches.append(cursor.fetchall())
```

Real-World Example: Semantic Document Search
```python
import heliosdb_sqlite as sqlite3
from sentence_transformers import SentenceTransformer

# Initialize embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Connect to database
conn = sqlite3.connect('knowledge_base.db', enable_vector_search=True)
cursor = conn.cursor()

# Create schema
cursor.execute('''
    CREATE TABLE IF NOT EXISTS articles (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT NOT NULL,
        content TEXT NOT NULL,
        category TEXT,
        embedding VECTOR(384),
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
''')

def add_article(title, content, category):
    """Add article with automatic embedding generation"""
    embedding = model.encode(content).tolist()
    cursor.execute('''
        INSERT INTO articles (title, content, category, embedding)
        VALUES (?, ?, ?, ?)
    ''', (title, content, category, embedding))
    conn.commit()
    return cursor.lastrowid

def semantic_search(query, limit=10, category=None):
    """Semantic search with optional category filter"""
    query_embedding = model.encode(query).tolist()

    if category:
        cursor.execute('''
            SELECT id, title, content, category,
                   vector_distance(embedding, ?, 'cosine') AS similarity
            FROM articles
            WHERE category = ?
            ORDER BY similarity ASC
            LIMIT ?
        ''', (query_embedding, category, limit))
    else:
        cursor.execute('''
            SELECT id, title, content, category,
                   vector_distance(embedding, ?, 'cosine') AS similarity
            FROM articles
            ORDER BY similarity ASC
            LIMIT ?
        ''', (query_embedding, limit))

    return cursor.fetchall()

# Usage
add_article(
    'Introduction to Neural Networks',
    'Neural networks are computing systems inspired by biological neural networks...',
    'AI'
)

results = semantic_search('deep learning basics', limit=5)
for article_id, title, content, category, similarity in results:
    print(f"[{category}] {title} (similarity: {similarity:.4f})")
```

Database Branching
Create experimental branches of your database for testing without affecting production data.
Creating and Using Branches
```python
import heliosdb_sqlite as sqlite3

conn = sqlite3.connect('production.db')

# Create a branch for experimentation
branch = conn.create_branch('experiment_2024')
print(f"Created branch: {branch.name}")

# Work on the branch
branch_conn = sqlite3.connect('production.db', branch='experiment_2024')
cursor = branch_conn.cursor()

# Make changes on the branch (doesn't affect main database)
cursor.execute('DELETE FROM users WHERE inactive = 1')
cursor.execute('UPDATE products SET price = price * 1.1')
branch_conn.commit()

# Test the changes
cursor.execute('SELECT COUNT(*) FROM users')
print(f"Users after cleanup: {cursor.fetchone()[0]}")

# If satisfied, merge back to main
if results_look_good():  # your own validation logic
    conn.merge_branch('experiment_2024')
    print("Branch merged successfully!")
else:
    conn.delete_branch('experiment_2024')
    print("Branch discarded")

branch_conn.close()
conn.close()
```

Use Cases for Branching
```python
import time
import heliosdb_sqlite as sqlite3

# Use Case 1: A/B Testing
def ab_test_query_performance():
    conn = sqlite3.connect('analytics.db')

    # Branch A: with new index
    conn.create_branch('with_index')
    conn_a = sqlite3.connect('analytics.db', branch='with_index')
    conn_a.execute('CREATE INDEX idx_user_date ON events(user_id, date)')

    # Branch B: without index
    conn.create_branch('without_index')
    conn_b = sqlite3.connect('analytics.db', branch='without_index')

    # Time the same query on both branches
    start = time.time()
    conn_a.execute('SELECT * FROM events WHERE user_id = ? AND date > ?',
                   (123, '2024-01-01'))
    time_a = time.time() - start

    start = time.time()
    conn_b.execute('SELECT * FROM events WHERE user_id = ? AND date > ?',
                   (123, '2024-01-01'))
    time_b = time.time() - start

    print(f"With index: {time_a:.4f}s")
    print(f"Without index: {time_b:.4f}s")

    # Merge the faster version
    if time_a < time_b:
        conn.merge_branch('with_index')
        print("Merged optimized version")

    conn_a.close()
    conn_b.close()
    conn.close()
```
```python
# Use Case 2: Safe Schema Migrations
def safe_migration():
    conn = sqlite3.connect('app.db')

    # Create migration branch
    conn.create_branch('schema_v2')
    conn_migration = sqlite3.connect('app.db', branch='schema_v2')

    try:
        # Attempt migration
        conn_migration.execute('ALTER TABLE users ADD COLUMN preferences JSON')
        conn_migration.execute("UPDATE users SET preferences = '{}'")
        conn_migration.commit()

        # Run tests on the migrated schema
        run_integration_tests(conn_migration)  # your own test suite

        # If tests pass, merge
        conn.merge_branch('schema_v2')
        print("Migration successful!")

    except Exception as e:
        print(f"Migration failed: {e}")
        conn.delete_branch('schema_v2')
        print("Rolled back to previous schema")

    conn_migration.close()
    conn.close()
```

Listing and Managing Branches
```python
conn = sqlite3.connect('app.db')

# List all branches
branches = conn.list_branches()
for branch in branches:
    print(f"Branch: {branch.name}, Created: {branch.created_at}, "
          f"Size: {branch.size_mb}MB")

# Get branch information
branch_info = conn.get_branch_info('experiment_2024')
print(f"Commits: {branch_info.commits}")
print(f"Diverged from main: {branch_info.divergence_point}")

# Compare branches
diff = conn.compare_branches('main', 'experiment_2024')
print(f"Tables changed: {diff.tables_modified}")
print(f"Rows changed: {diff.total_row_changes}")

# Clean up old branches
for branch in branches:
    if branch.age_days > 30 and not branch.is_active:
        conn.delete_branch(branch.name)
        print(f"Deleted old branch: {branch.name}")
```

Time-Travel Queries
Query historical data states without maintaining separate backup databases.
Basic Time-Travel Queries
```python
import heliosdb_sqlite as sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect('app.db', enable_time_travel=True)
cursor = conn.cursor()

# Query current state
cursor.execute('SELECT * FROM users WHERE id = 123')
current_user = cursor.fetchone()
print(f"Current: {current_user}")

# Query state from 1 hour ago (AS OF follows the table name)
one_hour_ago = datetime.now() - timedelta(hours=1)
cursor.execute(f'''
    SELECT * FROM users AS OF TIMESTAMP '{one_hour_ago.isoformat()}'
    WHERE id = 123
''')
past_user = cursor.fetchone()
print(f"1 hour ago: {past_user}")

# Query state from a specific date
cursor.execute('''
    SELECT * FROM products AS OF TIMESTAMP '2024-01-01 00:00:00'
''')
products_jan1 = cursor.fetchall()
print(f"Products on Jan 1: {len(products_jan1)}")
```

Tracking Changes Over Time
```python
# Find every version of a record in a time window
# (FOR SYSTEM_TIME also follows the table name)
cursor.execute('''
    SELECT system_time, id, name, email FROM users
    FOR SYSTEM_TIME BETWEEN TIMESTAMP '2024-01-01 00:00:00'
                        AND TIMESTAMP '2024-01-31 23:59:59'
    WHERE id = 123
''')

history = cursor.fetchall()
print(f"User had {len(history)} changes in January")

for timestamp, user_id, name, email in history:
    print(f"{timestamp}: {name}, {email}")
```

Audit and Compliance
```python
def audit_user_changes(user_id, start_date, end_date):
    """Generate audit report of all changes to a user"""
    conn = sqlite3.connect('secure.db', enable_time_travel=True)
    cursor = conn.cursor()

    cursor.execute('''
        SELECT system_time, id, name, email, role, modified_by
        FROM users
        FOR SYSTEM_TIME BETWEEN ? AND ?
        WHERE id = ?
        ORDER BY system_time
    ''', (start_date, end_date, user_id))

    changes = cursor.fetchall()

    print(f"Audit Report for User {user_id}")
    print(f"Period: {start_date} to {end_date}")
    print(f"Total changes: {len(changes)}\n")

    for i, (timestamp, uid, name, email, role, modified_by) in enumerate(changes):
        if i > 0:
            print(f"\n--- Change {i} at {timestamp} by {modified_by} ---")
            prev = changes[i - 1]
            if name != prev[2]:
                print(f"Name: {prev[2]} → {name}")
            if email != prev[3]:
                print(f"Email: {prev[3]} → {email}")
            if role != prev[4]:
                print(f"Role: {prev[4]} → {role}")
        else:
            print(f"Initial state at {timestamp}: {name}, {email}, {role}")

    conn.close()
```
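If you have not enabled time-travel yet, a trigger-maintained history table approximates this audit trail in plain SQLite. A self-contained sketch using the stdlib driver:

```python
import sqlite3  # stdlib driver; the pattern is identical under heliosdb_sqlite

conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE users_history (
        id INTEGER, name TEXT, email TEXT,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- Copy the old row into history on every update
    CREATE TRIGGER users_audit AFTER UPDATE ON users
    BEGIN
        INSERT INTO users_history (id, name, email)
        VALUES (OLD.id, OLD.name, OLD.email);
    END;
''')
conn.execute("INSERT INTO users VALUES (1, 'alice', 'a@old.example')")
conn.execute("UPDATE users SET email = 'a@new.example' WHERE id = 1")
history = conn.execute('SELECT name, email FROM users_history').fetchall()
print(history)  # [('alice', 'a@old.example')]
```

The trade-off is that you must write one trigger per audited table, and deletions need a second trigger; built-in time-travel handles all of this transparently.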
```python
# Usage
audit_user_changes(123, '2024-01-01', '2024-12-31')
```

Rollback to Previous State
```python
def rollback_table_to_timestamp(table_name, target_timestamp):
    """Restore table to a previous state.

    table_name must come from trusted code: identifiers cannot be
    bound as parameters, so it is interpolated into the SQL directly.
    """
    conn = sqlite3.connect('app.db', enable_time_travel=True)
    cursor = conn.cursor()

    # Get historical data
    cursor.execute(f'''
        SELECT * FROM {table_name}
        AS OF TIMESTAMP '{target_timestamp}'
    ''')
    historical_data = cursor.fetchall()

    # Get column names
    cursor.execute(f'PRAGMA table_info({table_name})')
    columns = [col[1] for col in cursor.fetchall()]

    # Clear current data
    cursor.execute(f'DELETE FROM {table_name}')

    # Restore historical data
    placeholders = ','.join(['?' for _ in columns])
    cursor.executemany(
        f'INSERT INTO {table_name} VALUES ({placeholders})',
        historical_data
    )

    conn.commit()
    print(f"Restored {table_name} to state at {target_timestamp}")
    print(f"Restored {len(historical_data)} rows")

    conn.close()
```
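Without time-travel, the closest plain-SQLite equivalent of this rollback is an explicit snapshot table taken before the risky operation. A runnable sketch with the stdlib driver:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)')
conn.executemany('INSERT INTO users VALUES (?, ?)', [(1, 'alice'), (2, 'bob')])

# Take a snapshot before a risky operation
conn.execute('CREATE TABLE users_snapshot AS SELECT * FROM users')

# Oops: accidental deletion
conn.execute('DELETE FROM users')

# Restore from the snapshot
conn.execute('INSERT INTO users SELECT * FROM users_snapshot')
restored = conn.execute('SELECT COUNT(*) FROM users').fetchone()[0]
print(restored)  # 2
```

Unlike time-travel, this only protects the moments you remembered to snapshot.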
```python
# Usage: rollback accidental deletion
rollback_table_to_timestamp('users', '2024-06-01 10:00:00')
```

Encryption at Rest
Protect sensitive data with transparent database encryption.
Enabling Encryption
```python
import heliosdb_sqlite as sqlite3
import os

# Generate or retrieve encryption key (keep this secure!)
encryption_key = os.getenv('DB_ENCRYPTION_KEY')

# Create encrypted database
conn = sqlite3.connect('secure.db', encryption_key=encryption_key)
cursor = conn.cursor()

# Use normally - encryption is transparent
cursor.execute('CREATE TABLE secrets (id INTEGER, data TEXT)')
cursor.execute("INSERT INTO secrets VALUES (1, 'sensitive information')")
conn.commit()

# Data is encrypted on disk
conn.close()

# To read, you must provide the same key
conn = sqlite3.connect('secure.db', encryption_key=encryption_key)
cursor = conn.cursor()
cursor.execute('SELECT * FROM secrets')
print(cursor.fetchall())  # Decrypted automatically
```

Key Rotation
```python
import os
import shutil
import tempfile

def rotate_encryption_key(db_path, old_key, new_key):
    """Safely rotate database encryption key"""
    # Open with old key
    conn_old = sqlite3.connect(db_path, encryption_key=old_key)

    # Create temporary database with new key
    temp_db = tempfile.NamedTemporaryFile(delete=False, suffix='.db')
    temp_db.close()  # we only need the path; the driver opens it itself
    conn_new = sqlite3.connect(temp_db.name, encryption_key=new_key)

    # Copy all data
    for line in conn_old.iterdump():
        conn_new.execute(line)
    conn_new.commit()

    conn_old.close()
    conn_new.close()

    # Replace old database
    shutil.move(temp_db.name, db_path)
    print("Encryption key rotated successfully")

# Usage
old_key = os.getenv('OLD_DB_KEY')
new_key = os.getenv('NEW_DB_KEY')
rotate_encryption_key('secure.db', old_key, new_key)
```

Encrypted Columns
```python
# Encrypt specific columns while keeping others searchable
conn = sqlite3.connect('mixed.db', encryption_key=encryption_key)
cursor = conn.cursor()

cursor.execute('''
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT,                 -- Not encrypted, searchable
        email TEXT ENCRYPTED,          -- Encrypted column
        password_hash TEXT ENCRYPTED
    )
''')

# Insert with automatic column encryption
cursor.execute('''
    INSERT INTO users (username, email, password_hash)
    VALUES (?, ?, ?)
''', ('alice', 'alice@example.com', 'hashed_password'))

# Search on non-encrypted columns (fast)
cursor.execute('SELECT * FROM users WHERE username = ?', ('alice',))

# Encrypted columns are decrypted automatically on read
user = cursor.fetchone()
print(f"Email: {user[2]}")  # Decrypted
```

Server Mode Switching
Run HeliosDB Nano as a PostgreSQL-compatible server for multi-client access.
Starting Server Mode
```python
import heliosdb_sqlite as sqlite3

# Start database in server mode
server = sqlite3.start_server(
    database='app.db',
    host='0.0.0.0',
    port=5432,
    max_connections=100
)

print(f"HeliosDB Nano server listening on port {server.port}")
print("Connect using: postgresql://localhost:5432/app")

# The server runs in the background while the application continues;
# clients can connect using standard PostgreSQL drivers.

# Stop server when done
# server.stop()
```

Connecting as Client
```python
# From another Python script or machine
import psycopg2  # Standard PostgreSQL driver

# Connect to HeliosDB Nano server
conn = psycopg2.connect(
    host='localhost',
    port=5432,
    database='app',
    user='helios',
    password=''
)

cursor = conn.cursor()
cursor.execute('SELECT * FROM users')
users = cursor.fetchall()
print(users)

conn.close()
```

Server Configuration
```python
# Advanced server configuration
server = sqlite3.start_server(
    database='app.db',
    host='0.0.0.0',
    port=5432,
    max_connections=200,
    connection_timeout=30,
    enable_ssl=True,
    ssl_cert='server.crt',
    ssl_key='server.key',
    auth_method='password',  # or 'certificate'
    allowed_ips=['192.168.1.0/24', '10.0.0.0/8']
)

# Monitor server status
status = server.get_status()
print(f"Active connections: {status.active_connections}")
print(f"Total queries: {status.total_queries}")
print(f"Uptime: {status.uptime_seconds}s")
```

Hybrid Mode Usage
Combine embedded and server modes for flexible architecture.
Hybrid Mode Setup
```python
import heliosdb_sqlite as sqlite3

# Start in hybrid mode
conn = sqlite3.connect('app.db', heliosdb_mode='hybrid')

# Use as embedded database (fast, local)
cursor = conn.cursor()
cursor.execute('SELECT COUNT(*) FROM users')
local_count = cursor.fetchone()[0]
print(f"Local access: {local_count} users")

# Also accessible via network (for other clients);
# a server is automatically started in the background
server_info = conn.get_server_info()
print(f"Also available at: {server_info.url}")

# Other applications can connect remotely
# while this application uses local access
```

Use Cases for Hybrid Mode
```python
# Use Case 1: Web application with admin tools
# The web app uses local embedded access for speed
import heliosdb_sqlite as sqlite3
from flask import Flask, jsonify

app = Flask(__name__)
db_conn = sqlite3.connect('app.db', heliosdb_mode='hybrid')

@app.route('/api/users')
def get_users():
    # Fast local access
    cursor = db_conn.cursor()
    cursor.execute('SELECT * FROM users')
    return jsonify(cursor.fetchall())

# Meanwhile, an admin can connect remotely for management:
# $ psql -h localhost -p 5432 -U helios app
```
```python
# Use Case 2: Local ML training with remote monitoring
import heliosdb_sqlite as sqlite3

# The ML training script uses local access
conn = sqlite3.connect('training.db', heliosdb_mode='hybrid')

def train_model():
    cursor = conn.cursor()
    for epoch in range(100):
        loss, accuracy = run_training_step()  # your training loop
        # Fast local writes
        cursor.execute('INSERT INTO training_log VALUES (?, ?, ?)',
                       (epoch, loss, accuracy))
        conn.commit()

# A monitoring dashboard connects remotely
# to visualize training progress in real time
```

Performance Tuning for HeliosDB Nano
Optimize HeliosDB Nano for your specific workload.
Connection Pooling
```python
from heliosdb_sqlite import connection_pool

# Create connection pool
pool = connection_pool.create(
    database='app.db',
    pool_size=20,
    max_overflow=10,
    timeout=30,
    recycle=3600  # Recycle connections after 1 hour
)

# Use connections from the pool
with pool.connection() as conn:
    cursor = conn.cursor()
    cursor.execute('SELECT * FROM users')
    users = cursor.fetchall()
```
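If you want to understand what a pool does, or need a fallback with the stdlib driver, a minimal queue-based pool can be sketched like this (hypothetical; HeliosDB's real pool adds overflow, recycling, and health checks):

```python
import sqlite3
import queue
from contextlib import contextmanager

class SimplePool:
    """Minimal fixed-size connection pool (illustrative sketch)."""

    def __init__(self, database, pool_size=5):
        self._free = queue.Queue()
        for _ in range(pool_size):
            self._free.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout=30):
        conn = self._free.get(timeout=timeout)  # block until one is free
        try:
            yield conn
        finally:
            self._free.put(conn)  # hand the connection back

pool = SimplePool(':memory:', pool_size=2)
with pool.connection() as conn:
    print(conn.execute('SELECT 1').fetchone())  # (1,)
```

Note that with `:memory:` each pooled connection holds its own private database; use a file path when the connections must share data.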
The connection is automatically returned to the pool when the `with` block exits.

Query Optimization
```python
conn = sqlite3.connect('app.db')

# Enable query plan analysis
conn.enable_query_analyzer()

# Run your queries
cursor = conn.cursor()
cursor.execute('SELECT * FROM users WHERE created_at > ?', ('2024-01-01',))

# Get optimization suggestions
suggestions = conn.get_query_suggestions()
for suggestion in suggestions:
    print(f"Query: {suggestion.query}")
    print(f"Current plan: {suggestion.current_plan}")
    print(f"Suggested index: {suggestion.suggested_index}")
    print(f"Expected speedup: {suggestion.speedup_factor}x")
```

Batch Operations
```python
# Efficient batch inserts
conn = sqlite3.connect('app.db')
cursor = conn.cursor()

users = [
    ('user1', 'email1@example.com'),
    ('user2', 'email2@example.com'),
    # ... thousands of users
]

# Traditional way (slower)
# for user in users:
#     cursor.execute('INSERT INTO users VALUES (?, ?)', user)
```
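A portable middle ground is `executemany`, which batches parameter binding in stock `sqlite3` and should work unchanged here. A self-contained example:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE users (name TEXT, email TEXT)')

users = [('user1', 'email1@example.com'), ('user2', 'email2@example.com')]
# One statement, many parameter tuples
cursor.executemany('INSERT INTO users VALUES (?, ?)', users)
conn.commit()

count = cursor.execute('SELECT COUNT(*) FROM users').fetchone()[0]
print(count)  # 2
```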
```python
# Optimized way (faster)
cursor.batch_insert('users', ['name', 'email'], users)
conn.commit()

print(f"Inserted {len(users)} users efficiently")
```

Caching Strategies
```python
# Enable query result caching
conn = sqlite3.connect('app.db', enable_cache=True, cache_size_mb=100)

# First query: hits the database
cursor = conn.cursor()
cursor.execute('SELECT * FROM products WHERE category = ?', ('electronics',))
products1 = cursor.fetchall()  # Slow (first time)

# Same query: served from cache
cursor.execute('SELECT * FROM products WHERE category = ?', ('electronics',))
products2 = cursor.fetchall()  # Fast (cached)

# Cache statistics
stats = conn.get_cache_stats()
print(f"Cache hit rate: {stats.hit_rate:.2%}")
print(f"Queries served from cache: {stats.cache_hits}")
```

Gradual Migration to Full HeliosDB Nano Features
Incrementally adopt advanced features as your application grows.
Phase 1: Drop-In Replacement
```python
# Start simple: just replace the import
import heliosdb_sqlite as sqlite3

# Everything else unchanged
conn = sqlite3.connect('app.db')
# ... existing code ...
```

Phase 2: Enable Better Concurrency
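To see what "better concurrency" means in practice, a small multi-writer stress test is useful. This sketch uses the stdlib driver, where heavy contention needs a generous timeout to avoid "database is locked" errors; the idea is that under HeliosDB Nano the same code runs without that tuning:

```python
import os
import sqlite3
import tempfile
import threading

db_path = os.path.join(tempfile.mkdtemp(), 'stress.db')

init = sqlite3.connect(db_path)
init.execute('CREATE TABLE log (thread INTEGER, i INTEGER)')
init.commit()
init.close()

def writer(tid, rows=50):
    # Each thread uses its own connection; the generous timeout keeps
    # stock SQLite from raising "database is locked" under contention
    conn = sqlite3.connect(db_path, timeout=30)
    for i in range(rows):
        conn.execute('INSERT INTO log VALUES (?, ?)', (tid, i))
        conn.commit()
    conn.close()

threads = [threading.Thread(target=writer, args=(t,)) for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sqlite3.connect(db_path).execute('SELECT COUNT(*) FROM log').fetchone()[0]
print(total)  # 200
```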
No code changes needed: HeliosDB Nano automatically handles concurrent writes better, so your existing multi-threaded and async code simply works better.

Phase 3: Add Vector Search
```python
# Add vector search to one feature
conn = sqlite3.connect('app.db', enable_vector_search=True)

# Add a vector column to an existing table
cursor = conn.cursor()
cursor.execute('ALTER TABLE articles ADD COLUMN embedding VECTOR(384)')

# Populate embeddings for existing data
cursor.execute('SELECT id, content FROM articles')
for article_id, content in cursor.fetchall():
    embedding = generate_embedding(content)  # your embedding model
    cursor.execute('UPDATE articles SET embedding = ? WHERE id = ?',
                   (embedding, article_id))
conn.commit()

# Now you have semantic search!
```

Phase 4: Enable Time-Travel for Audit
```python
# Enable time-travel on sensitive tables
conn = sqlite3.connect('app.db', enable_time_travel=True)

# Add system versioning to audit-critical tables
cursor = conn.cursor()
cursor.execute('''
    ALTER TABLE financial_transactions
    ADD SYSTEM VERSIONING
''')

# Now you have an automatic audit trail
```

Phase 5: Create Development Branches
```python
# Use branching for safe testing
conn = sqlite3.connect('app.db')

# Before a risky operation, create a branch
conn.create_branch('test_migration')
test_conn = sqlite3.connect('app.db', branch='test_migration')

# Test your changes
test_conn.execute('ALTER TABLE users ADD COLUMN new_field TEXT')

# If successful, merge; if not, discard
if tests_pass():  # your own test suite
    conn.merge_branch('test_migration')
else:
    conn.delete_branch('test_migration')
```

Phase 6: Scale with Server Mode
```python
# When you outgrow a single-machine setup
server = sqlite3.start_server(
    database='app.db',
    host='0.0.0.0',
    port=5432,
    max_connections=200
)

# The application can still use local access for speed,
# but now supports remote clients too
```

Summary
HeliosDB Nano advanced features are opt-in and gradual:
- Start with drop-in SQLite compatibility
- Add features as needed:
  - Vector search for AI/ML applications
  - Time-travel for audit and compliance
  - Branching for safe experimentation
  - Encryption for security
  - Server mode for scale
  - Hybrid mode for flexibility
- No need to adopt everything at once
- Each feature enhances, doesn't replace, basic functionality
- All features work together seamlessly
Remember: You can stay on basic SQLite compatibility forever if that meets your needs. Advanced features are there when you’re ready.
For more information:
- Drop-in guide: HELIOSDB_SQLITE_DROP_IN_GUIDE.md
- Migration patterns: HELIOSDB_SQLITE_MIGRATION_PATTERNS.md
- FAQ: HELIOSDB_SQLITE_FAQ.md