Ai Agents
HeliosDB-Lite AI Agent & Automation Technical Guide
Overview
This guide is designed for AI agents, automation systems, and programmatic access to HeliosDB-Lite. It provides structured information optimized for parsing and integration by intelligent systems.
1. System Specifications
Database Engine
- Type: PostgreSQL-compatible embedded database
- Storage: RocksDB with LSM-tree architecture
- Transaction Model: MVCC with snapshot isolation
- Concurrency: ACID transactions with row-level locking
- Query Language: SQL (95%+ PostgreSQL compatibility)
Supported Deployment Modes
| Mode | Location | Concurrency | Use Case |
|---|---|---|---|
embedded | In-process | Single process | Desktop apps, services |
server | Network | Multiple clients | Web apps, APIs |
in_memory | RAM only | Single process | Tests, analytics |
hybrid | Disk + cache | Multiple clients | Production systems |
Resource Requirements
| Component | Minimum | Recommended |
|---|---|---|
| Memory | 50 MB | 512 MB - 4 GB |
| Disk | 10 MB | 100 MB - 10 GB |
| CPU | 1 core | 2+ cores |
2. API Integration Points
Client Library APIs
Rust API (Native)
use heliosdb_lite::EmbeddedDatabase;
// Core operationsdb.execute(sql: &str, params: &[Value]) -> Result<()>;db.query(sql: &str, params: &[Value]) -> Result<Vec<Row>>;db.begin_transaction() -> Result<Transaction>;
// Vector operationsdb.create_vector_index(table: &str, column: &str, config: IndexConfig) -> Result<()>;db.vector_search(store: &str, query: Vec<f32>, top_k: usize) -> Result<Vec<SearchResult>>;
// Branchingdb.create_branch(name: &str, from: Option<&str>) -> Result<()>;db.merge_branch(source: &str, target: &str) -> Result<()>;
// Time-traveldb.query_as_of_timestamp(sql: &str, params: &[Value], timestamp: &str) -> Result<Vec<Row>>;PostgreSQL Wire Protocol
Supported for server mode (port 5432 by default):
postgresql://[user[:password]@][host][:port]/[database]
Example:postgresql://localhost:5432/mydbpostgresql://user:pass@db.example.com:5432/productionCompatible clients:
- psql (PostgreSQL CLI)
- Python:
psycopg2,asyncpg,sqlalchemy - JavaScript/Node.js:
pg,node-postgres - Rust:
sqlx,tokio-postgres,diesel - Java:
jdbc - Go:
database/sql,pgx - .NET:
Npgsql
SDK Integrations
Official SDKs Available:
| Language | Status | Location |
|---|---|---|
| Python | ✅ v3.0 | sdks/python/ |
| TypeScript | ✅ v3.0 | sdks/typescript/ |
| Go | ✅ v3.0 | sdks/go/ |
| Rust | ✅ v3.0 | sdks/rust/ |
3. Query Specification
SQL Dialect
Supported Features:
- DML: SELECT, INSERT, UPDATE, DELETE
- DDL: CREATE/DROP/ALTER TABLE, INDEX, VIEW
- Transactions: BEGIN, COMMIT, ROLLBACK
- Temporal: AS OF TIMESTAMP/TRANSACTION/SCN
- Vector operations: Vector types, HNSW indexes
- Functions: Aggregate, scalar, window functions
- Joins: INNER, LEFT, RIGHT, FULL OUTER, CROSS
Parameterized Query Format:
SELECT * FROM table WHERE column = $1 AND other = $2;-- Parameter indices: $1, $2, $3, ... (1-indexed)Data Types
| Type | Size | Example |
|---|---|---|
| BOOLEAN | 1 byte | true, false |
| INT2 | 2 bytes | -32768 to 32767 |
| INT4 | 4 bytes | -2147483648 to 2147483647 |
| INT8 | 8 bytes | -2^63 to 2^63-1 |
| FLOAT4 | 4 bytes | Single precision |
| FLOAT8 | 8 bytes | Double precision |
| DECIMAL(p,s) | Variable | 123.45 |
| TEXT | Variable | ’string’ |
| VARCHAR(n) | Variable | ’string’ |
| BYTEA | Variable | Binary data |
| TIMESTAMP | 8 bytes | 2025-01-15 12:00:00 |
| DATE | 4 bytes | 2025-01-15 |
| JSON | Variable | {“key”: “value”} |
| JSONB | Variable | {“key”: “value”} |
| VECTOR(n) | Variable | [0.1, 0.2, …] |
| UUID | 16 bytes | 550e8400-e29b-41d4-a716-446655440000 |
| ARRAY[T] | Variable | ARRAY[1,2,3] |
Vector Operations
Vector Type Definition:
-- Define columns with vector typecolumn_name VECTOR(dimensions)
-- Example: 1536-dimensional embedding (OpenAI)embedding VECTOR(1536)Distance Operators:
| Operator | Metric | Distance |
|---|---|---|
<=> | Cosine | 0 to 2 |
<#> | Euclidean | ≥ 0 |
<+> | Manhattan | ≥ 0 |
<~> | Dot Product | -∞ to ∞ |
Index Creation:
CREATE INDEX idx_name ON table USING hnsw (vector_column distance_ops)WITH ( m = 16, -- HNSW connection count ef_construction = 200, -- Build quality quantization = 'pq', -- Optional compression pq_subquantizers = 8 -- PQ settings);4. System Views & Monitoring
Information Schema Views
Available system views for introspection:
-- List all tablesSELECT * FROM information_schema.tables WHERE table_schema = 'public';
-- List all columnsSELECT * FROM information_schema.columns WHERE table_name = 'table_name';
-- List all indexesSELECT * FROM pg_indexes;
-- List all viewsSELECT * FROM information_schema.views WHERE table_schema = 'public';
-- Database statisticsSELECT * FROM pg_stat_user_tables;
-- Index statisticsSELECT * FROM pg_stat_user_indexes;
-- Vector index statisticsSELECT * FROM pg_vector_index_stats;
-- Branch informationSELECT * FROM pg_database_branches();
-- Materialized view stalenessSELECT * FROM pg_mv_staleness();
-- Current database sizeSELECT pg_database_size(current_database());5. Configuration Parameters
Configuration File (heliosdb.toml)
[storage]path = "./heliosdb-data" # Data directorycache_size = 536870912 # 512 MBcompression = "zstd" # zstd, lz4, nonein_memory = false # Disable persistence
[server]listen_addr = "0.0.0.0" # Listen addressport = 5432 # PostgreSQL portmax_connections = 100 # Connection pool sizeconnection_timeout_sec = 30 # Connection timeout
[performance]enable_simd = true # SIMD accelerationworker_threads = 4 # Query worker threadsparallel_query = true # Parallel query executionbuffer_pool_size = 268435456 # 256 MB
[encryption]enabled = false # Encryption at restalgorithm = "aes256-gcm" # AES-256-GCMkey_source = { environment = "HELIOSDB_KEY" }
[vector]enabled = true # Vector search supportpq_enabled = true # Product Quantizationpq_auto_compress = true # Auto-compress vectorspq_training_size = 10000 # PQ training sample size
[temporal]time_travel_enabled = true # Time-travel queriessnapshot_retention_days = 30 # Keep snapshots for N daysEnvironment Variables
# Server modeHELIOSDB_PORT=5432HELIOSDB_LISTEN_ADDR=0.0.0.0HELIOSDB_DATA_DIR=./data
# EncryptionHELIOSDB_ENCRYPTION_KEY=your-32-byte-key
# PerformanceHELIOSDB_WORKER_THREADS=4HELIOSDB_CACHE_SIZE=536870912
# LoggingHELIOSDB_LOG_LEVEL=info # trace, debug, info, warn, errorRUST_LOG=heliosdb_lite=debug6. Error Handling & Status Codes
PostgreSQL Error Codes
Standard PostgreSQL SQLSTATE codes are used:
| Code | Class | Meaning |
|---|---|---|
| 00000 | SUCCESS | Command successful |
| 01000 | WARNING | Warning (non-fatal) |
| 08000 | CONNECTION | Connection failure |
| 42P01 | UNDEFINED_TABLE | Table does not exist |
| 42P09 | UNDEFINED_OBJECT | Object does not exist |
| 23505 | UNIQUE_VIOLATION | Unique constraint violation |
| 23503 | FOREIGN_KEY | Foreign key constraint violation |
| 25001 | IN_FAILED_SQL_TRANSACTION | Transaction in failed state |
Structured Error Response
{ "error": { "code": "42P01", "message": "relation \"table_name\" does not exist", "detail": "Table 'table_name' was not found in schema 'public'", "severity": "ERROR", "sqlstate": "42P01" }}7. Performance Characteristics
Query Performance
| Operation | Throughput | Latency |
|---|---|---|
| Single row lookup | 100K+ QPS | <1ms |
| Full table scan | 500K-1M rows/sec | Variable |
| Vector search (k=10) | 1K-5K QPS | <100ms |
| Aggregation | 500K rows/sec | Variable |
| Join | 100K-500K rows/sec | Variable |
Compression Ratios
| Data Type | Method | Ratio |
|---|---|---|
| Numeric | ALP | 2-10x |
| String | FSST | 2-5x |
| Vector (768-dim) | PQ (u8) | 384x |
| Vector (1536-dim) | PQ (u8) | 768x |
Memory Usage
Base system: ~50 MBPer connection: ~1-2 MBCache (configurable): 50 MB to 4 GBIndex overhead: ~10-20% of data size8. Integration Patterns
Pattern 1: Embedded Database
Use Case: Local application storage
// Initializationlet db = EmbeddedDatabase::new("./app.helio")?;
// Execute querylet result = db.query("SELECT * FROM users", &[])?;
// Proper cleanupdrop(db); // Implicit SYNCLifecycle:
- Open database file
- Load schema and indexes
- Execute queries
- Sync data on drop
Concurrency: Single process with multi-threaded internal handling
Pattern 2: Server Mode
Use Case: Network accessible database
# Start serverheliosdb-lite start --port 5432 --data ./mydb
# Client connection (any language)Connection -> PostgreSQL wire protocol -> Server -> DatabaseLifecycle:
- Start server process
- Accept client connections
- Maintain connection pool
- Replicate transactions to disk
Concurrency: Multiple client connections (configurable max)
Pattern 3: Vector Search Integration
Use Case: Semantic search in applications
-- StorageCREATE TABLE documents ( id SERIAL, content TEXT, embedding VECTOR(1536));
CREATE INDEX idx_embedding ON documentsUSING hnsw (embedding vector_cosine_ops);
-- Insert with embeddingINSERT INTO documents VALUES (1, 'text', array[0.1, 0.2, ...]);
-- SearchSELECT id, content, embedding <=> ARRAY[0.15, 0.25, ...] AS distanceFROM documentsORDER BY distanceLIMIT 10;Pattern 4: Time-Travel Queries
Use Case: Historical data analysis
-- Current dataSELECT * FROM table;
-- Historical dataSELECT * FROM table AS OF TIMESTAMP '2025-01-01 00:00:00';
-- Audit trailSELECT id, action, timestampFROM audit_logWHERE occurred_at >= '2025-01-01';9. Compatibility Matrix
PostgreSQL Feature Support
| Feature | Support | Notes |
|---|---|---|
| SQL (DML/DDL) | 95%+ | Full support for common operations |
| Transactions | ✅ | MVCC, snapshot isolation |
| Indexes | ✅ | B-tree, HNSW |
| Extensions | ⚠️ | No PostGIS, pgvector |
| Replication | ⚠️ | Experimental |
| Procedures | ⚠️ | Limited |
| Triggers | ❌ | Not supported |
| Foreign Data | ❌ | Not supported |
ORM Compatibility
| Framework | Language | Status |
|---|---|---|
| SQLAlchemy | Python | ✅ Full |
| Diesel | Rust | ✅ Full |
| Prisma | TypeScript | ✅ Full |
| SQLx | Rust | ✅ Full |
| Entity Framework | .NET | ✅ Full |
10. Operational Parameters
Health Checks
-- Basic connectivitySELECT 1;
-- Database availabilitySELECT pg_database_size(current_database());
-- Index healthSELECT * FROM pg_stat_user_indexes;
-- Transaction activitySELECT * FROM pg_stat_activity;Maintenance Operations
-- VACUUM (cleanup dead tuples)VACUUM ANALYZE;
-- REINDEX (rebuild indexes)REINDEX DATABASE;
-- BackupBACKUP TO 's3://bucket/backup.sql';
-- RestoreRESTORE FROM 's3://bucket/backup.sql';Diagnostic Queries
-- Slow queries (>1 second)SELECT query, mean_time FROM pg_stat_statementsWHERE mean_time > 1000ORDER BY mean_time DESC;
-- Table bloatSELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as sizeFROM pg_tablesWHERE schemaname NOT IN ('pg_catalog', 'information_schema')ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
-- Connection statusSELECT * FROM pg_stat_activity;11. SDK Usage Patterns
Python
import heliosdb_client
# Connectdb = heliosdb_client.Client( base_url="http://localhost:8080", api_key="key")
# Queryresult = await db.query("SELECT * FROM users WHERE id = $1", [1])
# Vector searchresults = await db.vector_search("documents", "hello world", top_k=5)
# Memorymemory = db.memory("session_id")await memory.add("user", "Hello")TypeScript
import { HeliosDBClient } from 'heliosdb-client';
const db = new HeliosDBClient({ baseUrl: 'http://localhost:8080', apiKey: 'key'});
const result = await db.query( 'SELECT * FROM users WHERE id = $1', [1]);
const vectors = await db.vectorSearch( 'documents', 'hello world', { topK: 5 });Go
import "github.com/heliosdb/go-client"
db := heliosdb.NewClient( "http://localhost:8080", "api-key",)
result, err := db.Query(ctx, "SELECT * FROM users", []interface{}{})
vectors, err := db.VectorSearch(ctx, "documents", "hello", &heliosdb.SearchOptions{ TopK: 5,})12. Schema Examples
Example 1: E-commerce
CREATE TABLE customers ( id SERIAL PRIMARY KEY, email TEXT UNIQUE, name TEXT, created_at TIMESTAMP DEFAULT NOW());
CREATE TABLE products ( id SERIAL PRIMARY KEY, name TEXT, description TEXT, price DECIMAL(10,2), embedding VECTOR(1536));
CREATE TABLE orders ( id SERIAL PRIMARY KEY, customer_id INT REFERENCES customers, total DECIMAL(10,2), status TEXT, created_at TIMESTAMP DEFAULT NOW());
-- IndexesCREATE INDEX idx_customer_email ON customers(email);CREATE INDEX idx_order_customer ON orders(customer_id);CREATE INDEX idx_product_embedding ON products USING hnsw (embedding vector_cosine_ops);
-- Materialized viewCREATE MATERIALIZED VIEW customer_stats ASSELECT customer_id, COUNT(*) as order_count, SUM(total) as lifetime_valueFROM orders GROUP BY customer_id;Example 2: Time-Series Analytics
CREATE TABLE metrics ( id SERIAL PRIMARY KEY, sensor_id INT, metric_name TEXT, value FLOAT8, timestamp TIMESTAMP DEFAULT NOW(), metadata JSONB);
CREATE INDEX idx_sensor_time ON metrics(sensor_id, timestamp DESC);
CREATE MATERIALIZED VIEW hourly_avg ASSELECT sensor_id, metric_name, DATE_TRUNC('hour', timestamp) as hour, AVG(value) as avg_value, MAX(value) as max_value, MIN(value) as min_valueFROM metricsGROUP BY sensor_id, metric_name, DATE_TRUNC('hour', timestamp);
ALTER MATERIALIZED VIEW hourly_avg SET ( auto_refresh = true, staleness_threshold_sec = 300);13. Troubleshooting Guide for Automation
Connectivity Issues
Problem: Cannot connect to database
Diagnostic Steps:
-- 1. Test local connectivitySELECT 1;
-- 2. Check server statusSELECT version();
-- 3. Check active connectionsSELECT count(*) FROM pg_stat_activity;
-- 4. Check configurationSHOW listen_addresses;SHOW port;Performance Issues
Problem: Slow queries
Analysis:
-- Enable query loggingSET log_min_duration_statement = 1000;
-- Analyze query planEXPLAIN ANALYZE SELECT ...;
-- Check statisticsANALYZE table_name;
-- Check missing indexesSELECT schemaname, tablename, indexnameFROM pg_indexesWHERE schemaname NOT IN ('pg_catalog', 'information_schema');Resource Issues
Problem: Out of memory or disk space
Diagnosis:
-- Database sizeSELECT pg_database_size(current_database());
-- Table sizesSELECT tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))FROM pg_tablesWHERE schemaname NOT IN ('pg_catalog', 'information_schema')ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
-- Clear old dataDELETE FROM table_name WHERE created_at < '2024-01-01';VACUUM ANALYZE;