HeliosDB Nano Documentation

Complete documentation for HeliosDB Nano v3.4.0, a PostgreSQL-compatible embedded database with vector search, time-travel queries, and AI capabilities.

🚀 Start Here

For New Users

For Developers

For AI/Automation Systems

For Operators & DevOps


📚 Documentation Index

For complete documentation navigation, see DOCUMENTATION_INDEX.md

Directory Structure

📐 Architecture

Technical architecture and design documents:

📖 Guides

User-facing guides and tutorials:

📋 Planning

Project planning and specification documents:

📊 Reports

Implementation reports and status updates:

🔄 Migration

Repository migration and compatibility documentation:

🛠️ Implementation

Detailed implementation documentation:

📅 Sessions

Development session summaries and deliverables:

📢 Social

Community announcements and social media content:

🚀 Release

Release management documentation:

Getting Started:

For Developers:

For Users:


✨ HeliosDB Nano v3.4.0 Features

Core Database Features

SQL & Query Engine

  • ✅ Full PostgreSQL-compatible SQL support
  • ✅ Transactions with ACID guarantees (Serializable isolation)
  • ✅ Multi-Version Concurrency Control (MVCC)
  • ✅ Complex joins (inner, outer, cross, self-joins)
  • ✅ Subqueries and CTEs (Common Table Expressions)
  • ✅ Window functions and aggregations
  • ✅ 100+ built-in SQL functions
  • ✅ User-defined functions support

Data Types & Storage

  • ✅ Integers (INT, BIGINT, SMALLINT)
  • ✅ Floating point (FLOAT, DOUBLE)
  • ✅ Strings (TEXT, VARCHAR, CHAR)
  • ✅ Dates & times (DATE, TIMESTAMP, TIME, INTERVAL)
  • ✅ JSONB/JSON support with operators
  • ✅ Binary data (BLOB, BYTEA)
  • ✅ Booleans
  • ✅ Vectors (n-dimensional embeddings)
  • ✅ UUID support

Indexing & Optimization

  • ✅ B-tree indices (fast range scans)
  • ✅ Hash indices (fast equality)
  • ✅ HNSW vector indices (fast semantic search)
  • ✅ Bitmap indices (efficient filtering)
  • ✅ Composite indices (multi-column)
  • ✅ Partial indices (conditional)
  • ✅ Query optimizer with cost-based planning
  • ✅ Automatic index selection

AI & Vector Capabilities

Vector Search

  • ✅ HNSW (Hierarchical Navigable Small World) indexing
  • ✅ Multiple distance metrics (cosine, Euclidean, Manhattan, dot product)
  • ✅ Vector quantization (8-bit, product quantization)
  • ✅ SIMD-accelerated distance calculations
  • ✅ <1ms query latency on 1M+ vectors
  • ✅ Batch vector operations
  • ✅ Approximate nearest neighbor search
  • ✅ Exact nearest neighbor search
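
The four distance metrics listed above, and what exact (brute-force) nearest neighbor search means, can be illustrated with a short sketch. This is plain Python written for illustration only; the function names are hypothetical and are not HeliosDB Nano's API, and a real engine would use SIMD kernels and an HNSW index instead of scanning every vector.

```python
import math

def dot(a, b):
    # Dot product: larger values mean more similar (for normalized vectors).
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    # Straight-line (L2) distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute coordinate differences (L1 distance).
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 - cosine similarity; 0 means same direction, 1 means orthogonal.
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot(a, b) / (norm_a * norm_b)

def exact_nearest(query, vectors, k, metric=euclidean):
    # Exact k-NN: score every stored vector and return the indices of the
    # k smallest distances. Approximate search (e.g. HNSW) trades a little
    # recall for not having to visit every vector.
    order = sorted(range(len(vectors)), key=lambda i: metric(query, vectors[i]))
    return order[:k]
```

Exact search is the baseline against which the recall of an approximate index is usually measured.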

Embedding Management

  • ✅ Native vector embeddings (any dimension)
  • ✅ Embedding computation (via UDFs or external)
  • ✅ Vector similarity operations
  • ✅ Bulk vector import/export
  • ✅ Vector schema validation
  • ✅ Dimension-aware indexing

LLM Integration

  • ✅ Semantic search for RAG pipelines
  • ✅ Vector storage for embeddings
  • ✅ Full-text search with vector ranking
  • ✅ Hybrid queries (SQL + vector combined)
  • ✅ Context retrieval for prompt engineering
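
To make "hybrid queries" concrete, the sketch below combines a relational-style filter with vector ranking in plain Python. It only illustrates the idea; `hybrid_search`, the row layout, and the sample data are assumptions made for this example and do not reflect HeliosDB Nano's actual query syntax or SDK.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity in [-1, 1]; returns 0.0 if either vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_search(rows, predicate, query_vec, k):
    # Step 1: the "SQL" part, keep only rows matching the filter.
    candidates = [row for row in rows if predicate(row)]
    # Step 2: the "vector" part, rank survivors by similarity to the query.
    candidates.sort(
        key=lambda row: cosine_similarity(query_vec, row["embedding"]),
        reverse=True,
    )
    return candidates[:k]

# Hypothetical documents with a metadata column and a 2-d embedding.
docs = [
    {"id": 1, "lang": "en", "embedding": [1.0, 0.0]},
    {"id": 2, "lang": "de", "embedding": [1.0, 0.1]},
    {"id": 3, "lang": "en", "embedding": [0.0, 1.0]},
    {"id": 4, "lang": "en", "embedding": [0.9, 0.1]},
]
top = hybrid_search(docs, lambda row: row["lang"] == "en", [1.0, 0.0], k=2)
```

In a RAG pipeline, the rows returned by a query of this shape would typically be passed to the prompt as retrieved context.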

Data Compression

Compression Algorithms

  • ✅ ALP codec (Adaptive Lossless floating-Point compression)
  • ✅ FSST codec (Fast Static Symbol Table)
  • ✅ Zstandard compression
  • ✅ Dictionary compression
  • ✅ Run-length encoding (RLE)
  • ✅ Delta encoding (for time-series)
  • ✅ Bit-packing (for integers)
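
Two of the simpler techniques in this list, delta encoding and run-length encoding, are easy to show in a few lines. The sketch below is a generic illustration in plain Python, assuming integer values; it is not HeliosDB Nano's codec implementation, and the function names are hypothetical.

```python
from itertools import accumulate, groupby

def delta_encode(values):
    # Store the first value, then the difference between each pair of neighbors.
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    # A running sum restores the original values.
    return list(accumulate(deltas))

def rle_encode(values):
    # Collapse each run of equal values into a (value, run_length) pair.
    return [(value, len(list(group))) for value, group in groupby(values)]

def rle_decode(runs):
    # Expand each (value, run_length) pair back into repeated values.
    return [value for value, count in runs for _ in range(count)]

# Regularly spaced timestamps: the deltas are constant, so RLE shrinks them.
timestamps = [1000, 1010, 1020, 1030, 1040]
deltas = delta_encode(timestamps)
runs = rle_encode(deltas)
```

This shows why time-series columns tend to compress well: regular sampling turns a column of large, distinct numbers into a short list of small, repeated ones.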

Compression Performance

  • ✅ 40-100x compression on time-series data
  • ✅ 10-20x compression on typical data
  • ✅ Per-column compression selection
  • ✅ Transparent (automatic on write)
  • ✅ Instant decompression (SIMD-accelerated)

Advanced Features

Multi-Tenancy

  • ✅ Database branching for tenant isolation
  • ✅ Cryptographic branch identifiers
  • ✅ Per-tenant snapshots
  • ✅ Branch-specific indices
  • ✅ Branch merging & replication
  • ✅ Time-travel queries per branch

Time-Travel & Historical Data

  • ✅ Query data as of any point in time
  • ✅ Temporal tables with valid-time tracking
  • ✅ Branching for what-if analysis
  • ✅ Full history retention (configurable)
  • ✅ Snapshot isolation guarantees
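
The idea behind "query data as of any point in time" can be sketched with a tiny versioned key-value store: every write appends a new version instead of overwriting, and a read picks the latest version at or before the requested timestamp. This is a conceptual sketch only; the `VersionedStore` class and its methods are hypothetical and say nothing about how HeliosDB Nano stores history internally.

```python
import bisect

class VersionedStore:
    def __init__(self):
        # key -> (sorted commit timestamps, values at those timestamps)
        self._versions = {}

    def put(self, key, value, ts):
        stamps, values = self._versions.setdefault(key, ([], []))
        if stamps and ts <= stamps[-1]:
            raise ValueError("timestamps must be strictly increasing per key")
        # Append a new version; older versions stay readable.
        stamps.append(ts)
        values.append(value)

    def get_as_of(self, key, ts):
        stamps, values = self._versions.get(key, ([], []))
        # Index of the first version committed strictly after ts.
        idx = bisect.bisect_right(stamps, ts)
        # The version just before it was current at ts (None if none existed).
        return values[idx - 1] if idx else None

store = VersionedStore()
store.put("balance", 100, ts=10)
store.put("balance", 250, ts=20)
```

Because old versions are never modified, a reader holding a timestamp sees a stable snapshot without taking locks, which is the same property MVCC relies on.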

Materialized Views

  • ✅ Automatic view materialization
  • ✅ Incremental refresh (delta computation)
  • ✅ Scheduled refresh (background jobs)
  • ✅ Query optimization through view selection
  • ✅ Cascade invalidation handling
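
Incremental refresh means updating a materialized result from the rows that changed rather than recomputing it from scratch. The sketch below maintains a `SUM(amount) GROUP BY key` view from insert and delete deltas. It is a generic illustration under assumed names (`IncrementalSumView`, `apply`), not HeliosDB Nano's refresh mechanism.

```python
from collections import defaultdict

class IncrementalSumView:
    """Materialized SUM(amount) GROUP BY key, maintained from row deltas."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.counts = defaultdict(int)  # rows per group, to detect empty groups

    def apply(self, inserted=(), deleted=()):
        # Inserted rows add to their group's running total.
        for key, amount in inserted:
            self.totals[key] += amount
            self.counts[key] += 1
        # Deleted rows subtract; a group with no rows left disappears.
        for key, amount in deleted:
            self.totals[key] -= amount
            self.counts[key] -= 1
            if self.counts[key] == 0:
                del self.totals[key]
                del self.counts[key]

view = IncrementalSumView()
view.apply(inserted=[("a", 5), ("b", 3), ("a", 2)])
after_insert = dict(view.totals)
view.apply(deleted=[("b", 3)])
after_delete = dict(view.totals)
```

The cost of each refresh is proportional to the size of the delta, not the size of the base table, which is what makes frequent refreshes practical.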

Transactions & ACID

  • ✅ ACID compliance (Atomicity, Consistency, Isolation, Durability)
  • ✅ Serializable isolation (strongest guarantee)
  • ✅ Snapshot isolation
  • ✅ Read-committed isolation
  • ✅ Read-uncommitted isolation
  • ✅ Deadlock detection & resolution
  • ✅ Transaction rollback & savepoints
  • ✅ Lock-free reads (via MVCC)

Performance Features

Query Optimization

  • ✅ Cost-based query optimizer
  • ✅ Join order optimization
  • ✅ Predicate pushdown
  • ✅ Index selection (automatic)
  • ✅ Parallel query execution
  • ✅ Vectorized processing (SIMD)
  • ✅ Query result caching
  • ✅ Plan caching

Concurrency & Scaling

  • ✅ Lock-free reads
  • ✅ Concurrent write requests (serialized through a single writer)
  • ✅ Parallel scans
  • ✅ Batch operations
  • ✅ Connection pooling
  • ✅ Resource limits (memory, CPU)
  • ✅ Horizontal scaling (via embedding)

Benchmarked Performance

  • ✅ Sub-millisecond latency (<1ms P50)
  • ✅ <10ms P99 latency for complex queries
  • ✅ 50,000+ queries per second (QPS)
  • ✅ 10M+ events per day ingestion
  • ✅ 500K+ sensor readings per second
  • ✅ 100M+ row analytics queries < 500ms

Deployment & Operations

Deployment Modes

  • ✅ Embedded mode (in-process, no network)
  • ✅ Server mode (separate process, TCP)
  • ✅ In-memory mode (for testing)
  • ✅ Hybrid mode (embedded + optional replication)
  • ✅ Edge deployment (low-memory optimized)
  • ✅ Cloud deployment (Kubernetes-ready)

Infrastructure

  • ✅ Docker containerization
  • ✅ Kubernetes StatefulSets
  • ✅ Cloud provider support (AWS, Azure, GCP)
  • ✅ Persistent volumes (EBS, Azure Disk, etc.)
  • ✅ High availability (replication-ready)
  • ✅ Disaster recovery (backup/restore)

Management & Monitoring

  • ✅ System tables for monitoring
  • ✅ Performance metrics (latency, throughput)
  • ✅ Query profiling
  • ✅ Index usage analytics
  • ✅ Space usage reporting
  • ✅ Lock monitoring
  • ✅ Transaction tracking
  • ✅ Health checks

Backup & Recovery

  • ✅ Point-in-time recovery (PITR)
  • ✅ Incremental backups
  • ✅ Backup compression
  • ✅ Encrypted backups
  • ✅ Cross-region backup replication
  • ✅ One-click restore
  • ✅ Backup verification

API & Integration

Client Protocols

  • ✅ PostgreSQL Wire Protocol (pgwire)
  • ✅ REST/HTTP API
  • ✅ WebSocket support
  • ✅ Native SDKs (Rust, Python, TypeScript, Go)
  • ✅ ODBC/JDBC compatibility
  • ✅ ORM support (SQLAlchemy, TypeORM, etc.)

SDK Support

  • ✅ Python SDK (async/await)
  • ✅ TypeScript SDK (Node.js, Browser)
  • ✅ Go SDK (high-performance)
  • ✅ Rust SDK (native integration)
  • ✅ RESTful API client libraries

Connectivity

  • ✅ TLS/SSL encryption
  • ✅ Connection pooling
  • ✅ Configurable timeouts
  • ✅ Retry logic & backoff
  • ✅ Connection authentication
  • ✅ Rate limiting
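
"Retry logic & backoff" usually means retrying a failed operation with exponentially growing, randomized delays. The helper below is a minimal client-side sketch in plain Python; `retry_with_backoff` and its parameters are assumptions for illustration and are not part of any HeliosDB Nano SDK.

```python
import random
import time

def retry_with_backoff(
    operation,
    max_attempts=5,
    base_delay=0.1,
    max_delay=2.0,
    retry_on=(ConnectionError, TimeoutError),
    sleep=time.sleep,
):
    # Call operation() until it succeeds or max_attempts is reached.
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff, capped at max_delay: 0.1, 0.2, 0.4, ...
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random time up to the ceiling so many
            # clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter is injectable so the helper can be tested without real waiting. Only errors listed in `retry_on` are retried; anything else propagates immediately.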

Security Features

Authentication & Authorization

  • ✅ User authentication (credentials)
  • ✅ Role-based access control (RBAC)
  • ✅ User permissions
  • ✅ Table-level access control
  • ✅ Column-level security (via views)
  • ✅ Row-level security (via application)

Encryption

  • ✅ TLS 1.3 for connections
  • ✅ Configurable cipher suites
  • ✅ Certificate pinning support
  • ✅ Application-layer encryption
  • ✅ Key rotation support

Audit & Compliance

  • ✅ Audit logging
  • ✅ Query tracking
  • ✅ User action logging
  • ✅ Data access logging
  • ✅ Compliance reporting
  • ✅ GDPR right-to-be-forgotten
  • ✅ Data export for portability

Production Hardening

  • ✅ Rate limiting (built-in)
  • ✅ Lock poisoning recovery (graceful)
  • ✅ Error handling (no panics in production)
  • ✅ Resource limits (memory, connections)
  • ✅ Validation & sanitization
  • ✅ Protection against injection attacks

Developer Experience

REPL & Tooling

  • ✅ Interactive REPL for SQL
  • ✅ REPL history
  • ✅ Multi-line queries
  • ✅ Command history
  • ✅ Output formatting options
  • ✅ Execution timing
  • ✅ Query planning visualization

Error Handling

  • ✅ Detailed error messages
  • ✅ Error codes & categories
  • ✅ Stack traces (debugging)
  • ✅ Suggestion system (helpful hints)
  • ✅ Validation errors (schema)

Documentation

  • ✅ Comprehensive guides (2,500+ lines)
  • ✅ Code examples (50+ samples)
  • ✅ API documentation
  • ✅ Architecture documentation
  • ✅ Security hardening guide
  • ✅ Production deployment guide
  • ✅ Migration guides
  • ✅ Business use cases (4 detailed)

Enterprise Features

Scalability

  • ✅ Horizontal scaling (via containers)
  • ✅ Vertical scaling (increased resources)
  • ✅ Multi-node support (clustering)
  • ✅ Load balancing ready
  • ✅ Auto-scaling triggers

Reliability

  • ✅ 99.99% uptime SLA ready
  • ✅ Crash recovery (WAL)
  • ✅ Transaction durability
  • ✅ Replication support
  • ✅ Failover mechanisms

Multi-Tenancy Enterprise Support

  • ✅ Per-tenant resource limits
  • ✅ Tenant isolation guarantees
  • ✅ Tenant-aware monitoring
  • ✅ Per-tenant quotas
  • ✅ Tenant data separation (at rest)

Compliance & Governance

  • ✅ GDPR compliant (privacy features)
  • ✅ HIPAA ready (encryption, audit logs)
  • ✅ SOC 2 compatible (security controls)
  • ✅ Data retention policies
  • ✅ Compliance reporting

Contributing

When adding new documentation:

  1. Place it in the appropriate category directory
  2. Update this README with a link to the new document
  3. Use clear, descriptive filenames in UPPER_SNAKE_CASE
  4. Include a brief description in this index

Documentation Standards

  • All documentation should be in Markdown format
  • Use clear headings and table of contents for long documents
  • Include code examples where applicable
  • Keep guides practical and action-oriented
  • Update the index when adding new documents