Skip to content

HeliosDB Nano Feature Summary

HeliosDB Nano Feature Summary

Quick Reference Guide - High-level feature overview by category

Core Database Capabilities

SQL Execution

  • Full DML support: SELECT, INSERT, UPDATE, DELETE, TRUNCATE
  • Full DDL support: CREATE/DROP TABLE, CREATE/DROP INDEX
  • Query composition: JOINs, aggregates, subqueries, CTEs
  • Advanced features: Vector search, time-travel queries, parameterized queries

Data Types (20+)

Numeric (Int2/4/8, Float4/8, Numeric), Text (Varchar, Text, Char), Temporal (Date, Time, Timestamp, Timestamptz), Structured (JSON, JSONB, Array), Special (UUID, Vector, Bytea)

SQL Compliance

  • PostgreSQL 17 compatible (95%+ compatibility)
  • Standard SQL syntax
  • Extended features: Vector operations, temporal queries, branching

Phase 3 Features (Complete)

Database Branching

  • Create branches: CREATE DATABASE BRANCH name AS OF timestamp
  • Drop branches: DROP DATABASE BRANCH name [IF EXISTS]
  • Merge branches: MERGE DATABASE BRANCH source INTO target
  • Merge strategies: Auto, Manual, Theirs, Ours
  • Conflict detection and resolution

Time-Travel Queries

  • Query at specific timestamp: SELECT ... AS OF TIMESTAMP '2024-01-15 10:30:00'
  • Query at transaction ID: SELECT ... AS OF TRANSACTION txn_id
  • Query at System Change Number: SELECT ... AS OF SCN 12345
  • <2x performance overhead vs current queries
  • Snapshot caching and garbage collection

Materialized Views

  • Create/refresh/drop views: CREATE MATERIALIZED VIEW name AS query
  • Refresh strategies: Manual, Auto, Incremental
  • CPU-aware scheduler with priority queue
  • Auto-refresh with configurable intervals
  • Incremental refresh with delta tracking
  • Staleness monitoring and tracking

System Views

  • pg_database_branches - Branch metadata and lineage
  • pg_mv_staleness - Materialized view refresh status
  • pg_vector_index_stats - Vector index metrics
  • pg_compression_stats - Storage efficiency

Vector Search (HNSW)

  • Hierarchical Navigable Small World graphs
  • Distance metrics: Cosine, L2, Inner Product
  • SIMD acceleration (AVX2): 2-6x speedup
  • KNN similarity search
  • Multi-metric index support

Vector Quantization

  • Product Quantization for memory efficiency
  • Codebook generation and training
  • Hybrid search (quantized + exact)
  • Automatic memory management

Specialized Indexes

  • GIN Index for JSONB containment queries
  • B-tree indexes for range queries
  • Index metadata and statistics

Storage & Compression

Storage Engine

  • RocksDB LSM-tree based
  • Write-ahead logging (WAL) for durability
  • Atomic writes via WriteBatch
  • Iterator-based scans
  • Multiple compression modes

Compression Codecs

  • FSST: String compression (high ratio, fast)
  • ALP: Numeric compression (pattern-based)
  • AUTO: Automatic codec selection
  • Per-table and per-column configuration

Compression Features

  • Per-row automatic compression on INSERT
  • Lazy decompression on READ
  • Compression statistics and monitoring
  • Compression overhead tracking

Transactions & Consistency

ACID Guarantees

  • Atomicity: All-or-nothing operations
  • Consistency: Schema and data validation
  • Isolation: Snapshot isolation (MVCC)
  • Durability: Write-ahead logging

Transaction Control

  • Explicit: BEGIN, COMMIT, ROLLBACK
  • Implicit: Auto-commit on statements
  • Transaction write sets
  • Nested transaction detection
  • Timeout enforcement

MVCC (Multi-Version Concurrency Control)

  • Snapshot isolation without locks
  • Non-blocking reads
  • Write-your-own-writes
  • Optimistic concurrency

Network & Protocol

PostgreSQL Wire Protocol

  • Full v3.0 protocol compatibility
  • Async/await Tokio implementation
  • Multi-client support
  • Session management per connection
  • Simple and Extended Query protocols

Authentication

  • MD5 password authentication
  • SCRAM-SHA-256 authentication
  • SSL/TLS encrypted connections
  • Per-session user context

Server Mode

  • Embedded or network deployment
  • Standard PostgreSQL client compatibility
  • psql, pgAdmin, Python psycopg2, etc.

Security

Encryption (TDE)

  • AES-256-GCM transparent data encryption
  • Random 96-bit nonce per encryption
  • Argon2 password-based key derivation
  • NIST-standard algorithms
  • Optional encryption per database

Audit Logging

  • DDL operation logging (CREATE, DROP, ALTER)
  • DML operation logging (INSERT, UPDATE, DELETE)
  • Append-only tamper-proof log
  • Cryptographic checksums
  • Compliance: SOC2, HIPAA, GDPR

Developer Tools

Interactive REPL

  • Multi-line SQL editing
  • Command history with persistence
  • Auto-completion (tables, columns, keywords)
  • Result pretty-printing
  • Query timing display

Meta Commands

  • Schema: \d, \dt, \dS
  • Phase 3: \branches, \use, \snapshots, \dmv
  • Admin: \user, \password, \ssl, \server
  • Config: \set, \config, \timing, \compression, \stats

Query Analysis

  • EXPLAIN query plans
  • EXPLAIN ANALYZE with execution stats
  • Index recommendations
  • Query optimization hints
  • Performance metrics

Embedded Database API

Simple Interface

// File-based
let db = EmbeddedDatabase::new("./data")?;
// In-memory
let db = EmbeddedDatabase::new_in_memory()?;
// Execute SQL
db.execute("CREATE TABLE users (id INT, name TEXT)")?;
// Safe parameterized queries
db.execute_params(
"INSERT INTO users VALUES ($1, $2)",
&[Value::Int4(1), Value::String("Alice".to_string())]
)?;
// Query with results
let results = db.query_params(
"SELECT * FROM users WHERE id = $1",
&[Value::Int4(1)]
)?;

Configuration

  • Storage path or in-memory
  • Encryption on/off with key source
  • Compression codec selection
  • Query timeout settings
  • Cache size tuning

Performance Characteristics

  • HNSW index build: O(n log n)
  • KNN query: O(log n) for good graphs
  • SIMD acceleration: 2-6x speedup
  • Quantization: 4-16x memory reduction

Time-Travel

  • Snapshot creation: O(catalog update)
  • AS OF queries: <2x overhead
  • Garbage collection: Configurable retention

Materialized Views

  • Manual refresh: Full recompute
  • Incremental refresh: Delta-based (50-90% faster)
  • Scheduler: CPU-aware with backpressure

Compression

  • FSST: 30-80% reduction for text (fast)
  • ALP: 50-90% reduction for numbers (fast)
  • Storage overhead: <5%

Quality Metrics

Code Coverage

  • Total Lines: ~56,000 Rust
  • Test Pass Rate: >95%
  • Target Pass Rate: >90% ✅
  • Clippy Warnings: Minimal

Version Status

  • Latest: v3.4.0
  • Production Readiness: Production Ready
  • Core DB: Stable and tested

Module Breakdown

  • SQL execution: 100% complete
  • Storage: 100% complete
  • Vector search: 100% complete
  • Branching: 100% complete
  • Time-travel: 100% complete
  • Materialized views: 100% complete
  • Network/protocol: 100% complete

Deployment Options

Embedded (Default)

  • In-process database library
  • SQLite-style usage
  • Zero network overhead
  • Single-process only
  • File or memory-based storage

Network Server

  • PostgreSQL-compatible server
  • Tokio async networking
  • Multi-client support
  • Standard PostgreSQL tools
  • Horizontal deployment

Configuration

  • Memory tuning (cache, buffer sizes)
  • Compression codec selection
  • Encryption on/off
  • Query timeout settings
  • WAL sync modes (fast/safe/durable)

Known Limitations

Current Limitations

  1. Sync protocol experimental
  2. Performance timing tests may vary in CI environments

Not Yet Implemented

  • Distributed sharding (future)
  • Full replication (partial support via branching)
  • Column-level encryption (TDE available)
  • Geospatial indexes (future)

Feature Completeness

CategoryStatusVersion
SQL CoreCompletev1.0+
Vector Search (HNSW/PQ)Completev2.0+
Encryption (TDE)Completev2.0+
Database BranchingCompletev3.0+
Time-Travel QueriesCompletev3.0+
Materialized ViewsCompletev3.0+
PostgreSQL Wire ProtocolCompletev2.0+
System ViewsCompletev3.0+
SMFI (Storage Filtering)Completev3.3+
Multi-Tenancy/RLSCompletev3.4+
Row-Level SecurityCompletev3.4+
CDCCompletev3.4+

Quick Start Examples

Basic Usage

-- Create table
CREATE TABLE users (id SERIAL, name TEXT, email TEXT);
-- Insert data
INSERT INTO users (name, email) VALUES ('Alice', 'alice@example.com');
-- Query
SELECT * FROM users WHERE id = 1;
-- Update
UPDATE users SET email = 'alice.new@example.com' WHERE id = 1;
-- Delete
DELETE FROM users WHERE id = 1;

Vector Search

-- Create vector table
CREATE TABLE embeddings (
id SERIAL,
text TEXT,
embedding VECTOR(768)
);
-- Create HNSW index
CREATE INDEX emb_idx ON embeddings(embedding) USING hnsw;
-- Vector similarity search
SELECT id, text FROM embeddings
ORDER BY embedding <-> '[0.1, 0.2, ..., 0.9]' LIMIT 10;

Time-Travel

-- View data at specific time
SELECT * FROM users
AS OF TIMESTAMP '2024-01-15 10:30:00';
-- View at transaction
SELECT * FROM users
AS OF TRANSACTION 12345;
-- View at system change number
SELECT * FROM users
AS OF SCN 56789;

Database Branching

-- Create development branch
CREATE DATABASE BRANCH dev AS OF NOW;
-- Switch branch
\use dev
-- Make changes
INSERT INTO users (name) VALUES ('Bob');
-- Merge back
MERGE DATABASE BRANCH dev INTO main;

Materialized Views

-- Create materialized view
CREATE MATERIALIZED VIEW active_users AS
SELECT id, name FROM users WHERE active = true;
-- Auto-refresh every hour
CREATE MATERIALIZED VIEW active_users_auto AS
SELECT id, name FROM users WHERE active = true
WITH (refresh_strategy = 'auto', refresh_interval = 3600);
-- Refresh manually
REFRESH MATERIALIZED VIEW active_users;
-- Check staleness
SELECT * FROM pg_mv_staleness;

Documentation

  • Feature Analysis: See FEATURE_ANALYSIS.md for detailed breakdown
  • Installation: See project README
  • Examples: Check examples/ directory
  • Implementation Details: See docs/implementation/
  • Configuration: See src/config.rs and CONFIG.md

Last Updated: December 2025 HeliosDB Nano v3.4.0