Skip to content

HeliosDB Nano AI Agent & Automation Technical Guide

HeliosDB Nano AI Agent & Automation Technical Guide

Overview

This guide is designed for AI agents, automation systems, and programmatic access to HeliosDB Nano. It provides structured information optimized for parsing and integration by intelligent systems.


1. System Specifications

Database Engine

  • Type: PostgreSQL-compatible embedded database
  • Storage: RocksDB with LSM-tree architecture
  • Transaction Model: MVCC with snapshot isolation
  • Concurrency: ACID transactions with row-level locking
  • Query Language: SQL (95%+ PostgreSQL compatibility)

Supported Deployment Modes

ModeLocationConcurrencyUse Case
embeddedIn-processSingle processDesktop apps, services
serverNetworkMultiple clientsWeb apps, APIs
in_memoryRAM onlySingle processTests, analytics
hybridDisk + cacheMultiple clientsProduction systems

Resource Requirements

ComponentMinimumRecommended
Memory50 MB512 MB - 4 GB
Disk10 MB100 MB - 10 GB
CPU1 core2+ cores

2. API Integration Points

Client Library APIs

Rust API (Native)

use heliosdb_nano::EmbeddedDatabase;
// Core operations
db.execute(sql: &str, params: &[Value]) -> Result<()>;
db.query(sql: &str, params: &[Value]) -> Result<Vec<Row>>;
db.begin_transaction() -> Result<Transaction>;
// Vector operations
db.create_vector_index(table: &str, column: &str, config: IndexConfig) -> Result<()>;
db.vector_search(store: &str, query: Vec<f32>, top_k: usize) -> Result<Vec<SearchResult>>;
// Branching
db.create_branch(name: &str, from: Option<&str>) -> Result<()>;
db.merge_branch(source: &str, target: &str) -> Result<()>;
// Time-travel
db.query_as_of_timestamp(sql: &str, params: &[Value], timestamp: &str) -> Result<Vec<Row>>;

PostgreSQL Wire Protocol

Supported for server mode (port 5432 by default):

postgresql://[user[:password]@][host][:port]/[database]
Example:
postgresql://localhost:5432/mydb
postgresql://user:pass@db.example.com:5432/production

Compatible clients:

  • psql (PostgreSQL CLI)
  • Python: psycopg2, asyncpg, sqlalchemy
  • JavaScript/Node.js: pg, node-postgres
  • Rust: sqlx, tokio-postgres, diesel
  • Java: jdbc
  • Go: database/sql, pgx
  • .NET: Npgsql

SDK Integrations

Official SDKs Available:

LanguageStatusLocation
Python✅ v3.0sdks/python/
TypeScript✅ v3.0sdks/typescript/
Go✅ v3.0sdks/go/
Rust✅ v3.0sdks/rust/

3. Query Specification

SQL Dialect

Supported Features:

  • DML: SELECT, INSERT, UPDATE, DELETE
  • DDL: CREATE/DROP/ALTER TABLE, INDEX, VIEW
  • Transactions: BEGIN, COMMIT, ROLLBACK
  • Temporal: AS OF TIMESTAMP/TRANSACTION/SCN
  • Vector operations: Vector types, HNSW indexes
  • Functions: Aggregate, scalar, window functions
  • Joins: INNER, LEFT, RIGHT, FULL OUTER, CROSS

Parameterized Query Format:

SELECT * FROM table WHERE column = $1 AND other = $2;
-- Parameter indices: $1, $2, $3, ... (1-indexed)

Data Types

TypeSizeExample
BOOLEAN1 bytetrue, false
INT22 bytes-32768 to 32767
INT44 bytes-2147483648 to 2147483647
INT88 bytes-2^63 to 2^63-1
FLOAT44 bytesSingle precision
FLOAT88 bytesDouble precision
DECIMAL(p,s)Variable123.45
TEXTVariable’string’
VARCHAR(n)Variable’string’
BYTEAVariableBinary data
TIMESTAMP8 bytes2025-01-15 12:00:00
DATE4 bytes2025-01-15
JSONVariable{“key”: “value”}
JSONBVariable{“key”: “value”}
VECTOR(n)Variable[0.1, 0.2, …]
UUID16 bytes550e8400-e29b-41d4-a716-446655440000
ARRAY[T]VariableARRAY[1,2,3]

Vector Operations

Vector Type Definition:

-- Define columns with vector type
column_name VECTOR(dimensions)
-- Example: 1536-dimensional embedding (OpenAI)
embedding VECTOR(1536)

Distance Operators:

OperatorMetricDistance
<=>Cosine0 to 2
<#>Euclidean≥ 0
<+>Manhattan≥ 0
<~>Dot Product-∞ to ∞

Index Creation:

CREATE INDEX idx_name ON table USING hnsw (vector_column distance_ops)
WITH (
m = 16, -- HNSW connection count
ef_construction = 200, -- Build quality
quantization = 'pq', -- Optional compression
pq_subquantizers = 8 -- PQ settings
);

4. System Views & Monitoring

Information Schema Views

Available system views for introspection:

-- List all tables
SELECT * FROM information_schema.tables WHERE table_schema = 'public';
-- List all columns
SELECT * FROM information_schema.columns WHERE table_name = 'table_name';
-- List all indexes
SELECT * FROM pg_indexes;
-- List all views
SELECT * FROM information_schema.views WHERE table_schema = 'public';
-- Database statistics
SELECT * FROM pg_stat_user_tables;
-- Index statistics
SELECT * FROM pg_stat_user_indexes;
-- Vector index statistics
SELECT * FROM pg_vector_index_stats;
-- Branch information
SELECT * FROM pg_database_branches();
-- Materialized view staleness
SELECT * FROM pg_mv_staleness();
-- Current database size
SELECT pg_database_size(current_database());

5. Configuration Parameters

Configuration File (heliosdb.toml)

[storage]
path = "./heliosdb-data" # Data directory
cache_size = 536870912 # 512 MB
compression = "zstd" # zstd, lz4, none
in_memory = false # Disable persistence
[server]
listen_addr = "0.0.0.0" # Listen address
port = 5432 # PostgreSQL port
max_connections = 100 # Connection pool size
connection_timeout_sec = 30 # Connection timeout
[performance]
enable_simd = true # SIMD acceleration
worker_threads = 4 # Query worker threads
parallel_query = true # Parallel query execution
buffer_pool_size = 268435456 # 256 MB
[encryption]
enabled = false # Encryption at rest
algorithm = "aes256-gcm" # AES-256-GCM
key_source = { environment = "HELIOSDB_KEY" }
[vector]
enabled = true # Vector search support
pq_enabled = true # Product Quantization
pq_auto_compress = true # Auto-compress vectors
pq_training_size = 10000 # PQ training sample size
[temporal]
time_travel_enabled = true # Time-travel queries
snapshot_retention_days = 30 # Keep snapshots for N days

Environment Variables

Terminal window
# Server mode
HELIOSDB_PORT=5432
HELIOSDB_LISTEN_ADDR=0.0.0.0
HELIOSDB_DATA_DIR=./data
# Encryption
HELIOSDB_ENCRYPTION_KEY=your-32-byte-key
# Performance
HELIOSDB_WORKER_THREADS=4
HELIOSDB_CACHE_SIZE=536870912
# Logging
HELIOSDB_LOG_LEVEL=info # trace, debug, info, warn, error
RUST_LOG=heliosdb_nano=debug

6. Error Handling & Status Codes

PostgreSQL Error Codes

Standard PostgreSQL SQLSTATE codes are used:

CodeClassMeaning
00000SUCCESSCommand successful
01000WARNINGWarning (non-fatal)
08000CONNECTIONConnection failure
42P01UNDEFINED_TABLETable does not exist
42P09UNDEFINED_OBJECTObject does not exist
23505UNIQUE_VIOLATIONUnique constraint violation
23503FOREIGN_KEYForeign key constraint violation
25001IN_FAILED_SQL_TRANSACTIONTransaction in failed state

Structured Error Response

{
"error": {
"code": "42P01",
"message": "relation \"table_name\" does not exist",
"detail": "Table 'table_name' was not found in schema 'public'",
"severity": "ERROR",
"sqlstate": "42P01"
}
}

7. Performance Characteristics

Query Performance

OperationThroughputLatency
Single row lookup100K+ QPS<1ms
Full table scan500K-1M rows/secVariable
Vector search (k=10)1K-5K QPS<100ms
Aggregation500K rows/secVariable
Join100K-500K rows/secVariable

Compression Ratios

Data TypeMethodRatio
NumericALP2-10x
StringFSST2-5x
Vector (768-dim)PQ (u8)384x
Vector (1536-dim)PQ (u8)768x

Memory Usage

Base system: ~50 MB
Per connection: ~1-2 MB
Cache (configurable): 50 MB to 4 GB
Index overhead: ~10-20% of data size

8. Integration Patterns

Pattern 1: Embedded Database

Use Case: Local application storage

// Initialization
let db = EmbeddedDatabase::new("./app.helio")?;
// Execute query
let result = db.query("SELECT * FROM users", &[])?;
// Proper cleanup
drop(db); // Implicit SYNC

Lifecycle:

  1. Open database file
  2. Load schema and indexes
  3. Execute queries
  4. Sync data on drop

Concurrency: Single process with multi-threaded internal handling

Pattern 2: Server Mode

Use Case: Network accessible database

Terminal window
# Start server
heliosdb-nano start --port 5432 --data ./mydb
# Client connection (any language)
Connection -> PostgreSQL wire protocol -> Server -> Database

Lifecycle:

  1. Start server process
  2. Accept client connections
  3. Maintain connection pool
  4. Replicate transactions to disk

Concurrency: Multiple client connections (configurable max)

Pattern 3: Vector Search Integration

Use Case: Semantic search in applications

-- Storage
CREATE TABLE documents (
id SERIAL,
content TEXT,
embedding VECTOR(1536)
);
CREATE INDEX idx_embedding ON documents
USING hnsw (embedding vector_cosine_ops);
-- Insert with embedding
INSERT INTO documents VALUES
(1, 'text', array[0.1, 0.2, ...]);
-- Search
SELECT id, content,
embedding <=> ARRAY[0.15, 0.25, ...] AS distance
FROM documents
ORDER BY distance
LIMIT 10;

Pattern 4: Time-Travel Queries

Use Case: Historical data analysis

-- Current data
SELECT * FROM table;
-- Historical data
SELECT * FROM table AS OF TIMESTAMP '2025-01-01 00:00:00';
-- Audit trail
SELECT id, action, timestamp
FROM audit_log
WHERE occurred_at >= '2025-01-01';

9. Compatibility Matrix

PostgreSQL Feature Support

FeatureSupportNotes
SQL (DML/DDL)95%+Full support for common operations
TransactionsMVCC, snapshot isolation
IndexesB-tree, HNSW
Extensions⚠️No PostGIS, pgvector
Replication⚠️Experimental
Procedures⚠️Limited
TriggersNot supported
Foreign DataNot supported

ORM Compatibility

FrameworkLanguageStatus
SQLAlchemyPython✅ Full
DieselRust✅ Full
PrismaTypeScript✅ Full
SQLxRust✅ Full
Entity Framework.NET✅ Full

10. Operational Parameters

Health Checks

-- Basic connectivity
SELECT 1;
-- Database availability
SELECT pg_database_size(current_database());
-- Index health
SELECT * FROM pg_stat_user_indexes;
-- Transaction activity
SELECT * FROM pg_stat_activity;

Maintenance Operations

-- VACUUM (cleanup dead tuples)
VACUUM ANALYZE;
-- REINDEX (rebuild indexes)
REINDEX DATABASE;
-- Backup
BACKUP TO 's3://bucket/backup.sql';
-- Restore
RESTORE FROM 's3://bucket/backup.sql';

Diagnostic Queries

-- Slow queries (>1 second)
SELECT query, mean_time FROM pg_stat_statements
WHERE mean_time > 1000
ORDER BY mean_time DESC;
-- Table bloat
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
-- Connection status
SELECT * FROM pg_stat_activity;

11. SDK Usage Patterns

Python

import heliosdb_client
# Connect
db = heliosdb_client.Client(
base_url="http://localhost:8080",
api_key="key"
)
# Query
result = await db.query("SELECT * FROM users WHERE id = $1", [1])
# Vector search
results = await db.vector_search("documents", "hello world", top_k=5)
# Memory
memory = db.memory("session_id")
await memory.add("user", "Hello")

TypeScript

import { HeliosDBClient } from 'heliosdb-client';
const db = new HeliosDBClient({
baseUrl: 'http://localhost:8080',
apiKey: 'key'
});
const result = await db.query(
'SELECT * FROM users WHERE id = $1',
[1]
);
const vectors = await db.vectorSearch(
'documents',
'hello world',
{ topK: 5 }
);

Go

import "github.com/heliosdb/go-client"
db := heliosdb.NewClient(
"http://localhost:8080",
"api-key",
)
result, err := db.Query(ctx, "SELECT * FROM users", []interface{}{})
vectors, err := db.VectorSearch(ctx, "documents", "hello", &heliosdb.SearchOptions{
TopK: 5,
})

12. Schema Examples

Example 1: E-commerce

CREATE TABLE customers (
id SERIAL PRIMARY KEY,
email TEXT UNIQUE,
name TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT,
description TEXT,
price DECIMAL(10,2),
embedding VECTOR(1536)
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers,
total DECIMAL(10,2),
status TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
-- Indexes
CREATE INDEX idx_customer_email ON customers(email);
CREATE INDEX idx_order_customer ON orders(customer_id);
CREATE INDEX idx_product_embedding ON products
USING hnsw (embedding vector_cosine_ops);
-- Materialized view
CREATE MATERIALIZED VIEW customer_stats AS
SELECT customer_id, COUNT(*) as order_count, SUM(total) as lifetime_value
FROM orders GROUP BY customer_id;

Example 2: Time-Series Analytics

CREATE TABLE metrics (
id SERIAL PRIMARY KEY,
sensor_id INT,
metric_name TEXT,
value FLOAT8,
timestamp TIMESTAMP DEFAULT NOW(),
metadata JSONB
);
CREATE INDEX idx_sensor_time ON metrics(sensor_id, timestamp DESC);
CREATE MATERIALIZED VIEW hourly_avg AS
SELECT
sensor_id,
metric_name,
DATE_TRUNC('hour', timestamp) as hour,
AVG(value) as avg_value,
MAX(value) as max_value,
MIN(value) as min_value
FROM metrics
GROUP BY sensor_id, metric_name, DATE_TRUNC('hour', timestamp);
ALTER MATERIALIZED VIEW hourly_avg SET (
auto_refresh = true,
staleness_threshold_sec = 300
);

13. Troubleshooting Guide for Automation

Connectivity Issues

Problem: Cannot connect to database

Diagnostic Steps:

-- 1. Test local connectivity
SELECT 1;
-- 2. Check server status
SELECT version();
-- 3. Check active connections
SELECT count(*) FROM pg_stat_activity;
-- 4. Check configuration
SHOW listen_addresses;
SHOW port;

Performance Issues

Problem: Slow queries

Analysis:

-- Enable query logging
SET log_min_duration_statement = 1000;
-- Analyze query plan
EXPLAIN ANALYZE SELECT ...;
-- Check statistics
ANALYZE table_name;
-- Check missing indexes
SELECT schemaname, tablename, indexname
FROM pg_indexes
WHERE schemaname NOT IN ('pg_catalog', 'information_schema');

Resource Issues

Problem: Out of memory or disk space

Diagnosis:

-- Database size
SELECT pg_database_size(current_database());
-- Table sizes
SELECT tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
-- Clear old data
DELETE FROM table_name WHERE created_at < '2024-01-01';
VACUUM ANALYZE;

References