HeliosDB-Lite: HeliosCore & HTAP Features Guide
HeliosDB-Lite: HeliosCore & HTAP Features Guide
Version: 3.6.0 Last updated: 2026-04-19
HeliosDB-Lite is a 152,000 LOC Rust database engine with MVCC, ACID transactions, full SQL support, vector search, encryption, branching, and HTAP capabilities. This guide covers every major feature with practical SQL examples.
Table of Contents
- Columnar Storage
- HTAP Workload Classification
- Smart Scan / Predicate Pushdown
- Parallel Execution
- Materialized Views
- Partitioning
- EXPLAIN ANALYZE
- VACUUM / Maintenance
- Backup & Restore
- Point-in-Time Recovery (PITR)
- GRANT/REVOKE & RBAC
- Row-Level Security (RLS)
- Rate Limiting
- SSL Enforcement
- Plan Cache & Result Cache
- Time-Travel Queries
- Database Branching
- Full-Text Search
- Vector Search
- Transparent Data Encryption (TDE)
- Triggers & PL/pgSQL
- Prepared Statements
- Transactions & Savepoints
- UPSERT (ON CONFLICT)
- Window Functions
- JSONB Support
- Common Table Expressions (CTEs)
- System Views
- Connection Protocols
- Graph Adjacency Lists
- BM25 & Hybrid Search
- Arena Allocation
- Compiled Query Plans
- SQL Cursors
- Metrics & Health Endpoints
1. Columnar Storage
HeliosDB supports per-column storage mode selection, allowing HTAP workloads to mix row-oriented (OLTP) and column-oriented (OLAP) storage within the same table.
Storage modes available:
DEFAULT— Standard row-oriented storage (best for OLTP)DICTIONARY— Dictionary-encoded for low-cardinality strings (status codes, enums)CONTENT_ADDRESSED— Hash-based dedup for large values (documents, blobs)COLUMNAR— Column-grouped storage for analytics (aggregations, range scans)
Creating Tables with Storage Modes
CREATE TABLE analytics ( id INT PRIMARY KEY, status TEXT STORAGE DICTIONARY, bio TEXT STORAGE CONTENT_ADDRESSED, metric FLOAT8 STORAGE COLUMNAR);Altering Column Storage on Existing Tables
ALTER TABLE events ALTER COLUMN payload SET STORAGE COLUMNAR;What happens: HeliosDB migrates existing data to the new storage layout and logs the change. The EXPLAIN output reflects the storage mode.
EXPLAIN Reflects Columnar Scan
When all columns in a table use COLUMNAR mode, EXPLAIN shows Columnar Scan instead of Seq Scan:
CREATE TABLE metrics ( ts TIMESTAMP STORAGE COLUMNAR, value FLOAT8 STORAGE COLUMNAR, label TEXT STORAGE COLUMNAR);
EXPLAIN SELECT * FROM metrics;Example output:
Columnar Scan on metrics Output: ts, value, labelFor partially columnar tables, the scan type remains Seq Scan but the storage mode is tracked per column.
2. HTAP Workload Classification
Every query is automatically classified as OLTP, OLAP, or HTAP based on plan features (point queries, aggregations, row access patterns). This classification drives resource allocation and appears in EXPLAIN output.
Viewing Classification
EXPLAIN SELECT * FROM users WHERE id = 42;Example output:
Filtered Scan on users Filter: id = 42Workload Classification Type : OLTP Confidence : 90% OLTP Score : 0.85 OLAP Score : 0.15 Recommended Store: RowEXPLAIN SELECT department, COUNT(*), AVG(salary)FROM employeesGROUP BY department;Example output:
Aggregate Group By: department -> Seq Scan on employeesWorkload Classification Type : OLAP Confidence : 95% OLTP Score : 0.10 OLAP Score : 0.90 Recommended Store: ColumnResource Management
The ResourceManager allocates separate OLTP and OLAP quotas. OLTP queries receive lower-latency paths; OLAP queries may use parallel execution and larger memory budgets. Classification is automatic — no user intervention required.
3. Smart Scan / Predicate Pushdown
The optimizer pushes filter predicates down to the storage layer, creating FilteredScan nodes that evaluate conditions during the scan rather than after reading all rows.
How It Works
SELECT * FROM orders WHERE status = 'pending' AND amount > 1000;Without pushdown:
Filter (status = 'pending' AND amount > 1000) -> Seq Scan on ordersWith pushdown (automatic):
Filtered Scan on orders Storage Filter: status = 'pending' AND amount > 1000Supported pushdown predicates: =, !=, <, <=, >, >=, BETWEEN, LIKE, IS NULL, IS NOT NULL, IN, AND, OR.
Projection Pushdown
The optimizer also pushes column projections, reading only the requested columns from storage:
SELECT name, email FROM users WHERE active = true;The storage layer reads only the name, email, and active columns, skipping all others.
Aggregate Pushdown
For simple aggregations, the storage layer can compute results directly:
SELECT COUNT(*) FROM large_table;The HeliosCore storage tier computes the count during the scan without materializing rows.
4. Parallel Execution
HeliosDB uses rayon-based work-stealing parallelism for large scans, hash joins, aggregations, and sorting.
Automatic Parallel Scan
Tables with 64 or more rows using default storage mode automatically use parallel scanning:
-- Automatic: no configuration needed for tables > 64 rowsSELECT COUNT(*) FROM large_table;The parallel scanner distributes data across rayon worker threads, achieves up to 2.3x scaling, and supports predicate filtering during the parallel phase.
Parallel Operations
| Operation | Parallelism | Trigger |
|---|---|---|
| Table Scan | par_iter() via rayon | >= 64 rows |
| Hash Join | Parallel build + probe | Automatic |
| Aggregate | Parallel partial aggregation | With parallel scan |
| Sort | Parallel merge sort | With parallel scan |
Monitoring
SELECT * FROM pg_parallel_exec_stats;Example output:
total_scans | parallel_scans | sequential_scans | tuples_scanned | avg_parallelism | peak_throughput ----------- | -------------- | ---------------- | -------------- | --------------- | --------------- 1024 | 847 | 177 | 52000000 | 3.7 | 912000.05. Materialized Views
Materialized views store precomputed query results for fast reads. The optimizer can transparently rewrite queries to read from the materialized view data table instead of the base tables.
Creating and Refreshing
CREATE MATERIALIZED VIEW monthly_sales ASSELECT date_trunc('month', order_date) AS month, SUM(amount) AS total, COUNT(*) AS order_countFROM ordersGROUP BY date_trunc('month', order_date);-- Refresh with latest dataREFRESH MATERIALIZED VIEW monthly_sales;Transparent Query Rewrite
After creating the materialized view, the optimizer may rewrite matching queries to use it:
-- User writes:SELECT date_trunc('month', order_date) AS month, SUM(amount) AS totalFROM ordersGROUP BY date_trunc('month', order_date);
-- Optimizer rewrites to:-- Seq Scan on __mv_monthly_salesThis is automatic. Use EXPLAIN to verify rewrite:
EXPLAIN SELECT date_trunc('month', order_date) AS month, SUM(amount)FROM orders GROUP BY date_trunc('month', order_date);Listing Materialized Views
SELECT * FROM pg_matviews;Staleness Monitoring
SELECT * FROM pg_mv_staleness;Returns the last refresh time and whether the underlying data has changed since.
6. Partitioning
HeliosDB supports HASH, RANGE, and LIST partitioning with partition-aware scanning and pruning in the optimizer.
Partition by Hash
CREATE TABLE events ( id INT, created_at TIMESTAMP, data TEXT) PARTITION BY HASH (id);Partition by Range
CREATE TABLE logs ( id INT, log_date DATE, message TEXT) PARTITION BY RANGE (log_date);Partition by List
CREATE TABLE orders ( id INT, region TEXT, amount FLOAT8) PARTITION BY LIST (region);Partition Pruning
The optimizer prunes irrelevant partitions during query planning:
-- Only scans the partition containing region = 'EU'SELECT * FROM orders WHERE region = 'EU';7. EXPLAIN ANALYZE
HeliosDB provides comprehensive query plan introspection with multiple formats and extensions beyond standard PostgreSQL.
Basic EXPLAIN
EXPLAIN SELECT * FROM users WHERE age > 25;Example output:
Filtered Scan on users Storage Filter: age > 25 Estimated Rows: 150 Estimated Cost: 25.0Workload Classification Type : OLAP Confidence : 80% OLTP Score : 0.30 OLAP Score : 0.70 Recommended Store: ColumnEXPLAIN ANALYZE (with execution)
Runs the query and reports actual row counts and timing:
EXPLAIN ANALYZE SELECT * FROM users WHERE age > 25;Example output:
Filtered Scan on users (actual rows=147, time=1.234ms) Storage Filter: age > 25 Estimated Rows: 150Workload Classification Type : OLAP Confidence : 80%Output Formats
-- JSON formatEXPLAIN (FORMAT JSON) SELECT * FROM users;
-- YAML formatEXPLAIN (FORMAT YAML) SELECT * FROM users;
-- Tree formatEXPLAIN (FORMAT TREE) SELECT * FROM users;
-- Text format (default)EXPLAIN (FORMAT TEXT) SELECT * FROM users;Full Options
EXPLAIN (ANALYZE, VERBOSE) SELECT * FROM users WHERE id = 1;EXPLAIN (ANALYZE, FORMAT JSON) SELECT * FROM users;HeliosDB-Specific Extensions
-- Storage layer analysis (bloom filters, zone maps, compression)EXPLAIN (STORAGE) SELECT * FROM orders;
-- AI-powered explanationEXPLAIN (AI) SELECT * FROM users WHERE status = 'active';
-- Why-Not analysis: why certain optimizations were not appliedEXPLAIN (ANALYZE, STORAGE, WHY_NOT) SELECT * FROM orders;
-- Trace mode: full lifecycle tracingEXPLAIN (TRACE) SELECT * FROM users;8. VACUUM / Maintenance
VACUUM reclaims storage from deleted/updated rows and optionally updates table statistics.
Basic VACUUM
-- Clean expired version data and empty tombstones across all tablesVACUUM;Example output:
command | versions_removed | tombstones_removed | bytes_freed | duration_ms | is_full | is_analyze ------- | ---------------- | ------------------ | ----------- | ----------- | ------- | ---------- VACUUM | 1203 | 45 | 98304 | 12 | false | falseVACUUM FULL
Aggressive cleanup pass:
VACUUM FULL;VACUUM ANALYZE
Clean up and update statistics for the query optimizer:
-- All tablesVACUUM ANALYZE;
-- Specific tableVACUUM ANALYZE users;When ANALYZE is included, HeliosDB scans each table to compute column histograms and cardinality estimates used by the optimizer.
Maintenance Scheduler
HeliosDB includes a MaintenanceScheduler that can automatically dispatch:
- Analyze — update statistics
- Vacuum — reclaim space
- Checkpoint — flush WAL to stable storage
- Compaction — merge storage segments
9. Backup & Restore
HeliosDB provides SQL-level backup and restore with support for full and incremental backups.
Full Backup
BACKUP DATABASE;Example output:
command | backup_id | type | keys | size_bytes | path | duration_ms ------- | ---------------- | ---- | ----- | ---------- | ----------------------------- | ----------- BACKUP | bk_1711234567890 | full | 15234 | 2457600 | /tmp/bk_1711234567890.hldb | 45Incremental Backup
Only backs up data changed since the last backup:
BACKUP DATABASE INCREMENTAL;Backup to Specific Path
BACKUP DATABASE TO '/backups/mydb_20260324.hldb';Restore from Backup
RESTORE DATABASE FROM 'bk_1711234567890';Restore to a New Branch
Non-destructive restore — creates a branch with the backup data:
RESTORE DATABASE FROM 'bk_1711234567890' TO BRANCH 'recovery';Incremental Chain Restore
When restoring an incremental backup, HeliosDB automatically walks the backup chain back to the base full backup and applies all deltas in order.
10. Point-in-Time Recovery (PITR)
Fork the database state from any past timestamp into a new branch.
Recovery Syntax
RECOVER DATABASE TO TIMESTAMP '2026-03-21T10:00:00' AS BRANCH 'before_incident';What happens:
- Creates a new branch named
before_incident - Scans every user table at the specified MVCC snapshot timestamp
- Writes the historical rows into the new branch
- Original data remains untouched
Example output:
command | branch | timestamp | tables_recovered | rows_recovered | duration_ms ------- | ---------------- | -------------- | ---------------- | -------------- | ----------- RECOVER | before_incident | 1711018800000 | 8 | 45230 | 234Auto-Named Branch
If no branch name is specified, HeliosDB generates one:
RECOVER DATABASE TO TIMESTAMP '2026-03-20T12:00:00';-- Creates branch: pitr_171093600000011. GRANT/REVOKE & RBAC
HeliosDB implements a full role-based access control system with table-level and column-level privilege grants, role hierarchy, and WITH GRANT OPTION.
Creating Roles
CREATE ROLE analyst WITH LOGIN;CREATE ROLE manager WITH LOGIN;Granting Privileges
-- Table-levelGRANT SELECT ON sales TO analyst;GRANT INSERT, UPDATE ON sales TO manager;
-- All privilegesGRANT ALL ON orders TO manager;
-- With grant option (can re-grant to others)GRANT SELECT ON reports TO manager WITH GRANT OPTION;Column-Level Grants
GRANT SELECT (name, email) ON users TO analyst;Role Hierarchy
GRANT analyst TO alice; -- alice inherits analyst's privilegesGRANT manager TO bob;Revoking Privileges
REVOKE INSERT ON sales FROM analyst;REVOKE ALL ON orders FROM manager;Introspection
-- View all grantsSELECT * FROM helios_grants;Example output:
table_name | grantee | privilege | with_grant_option ---------- | -------- | --------- | ----------------- sales | analyst | SELECT | false sales | manager | INSERT | false sales | manager | UPDATE | false reports | manager | SELECT | true-- View all rolesSELECT * FROM helios_roles;Example output:
role_name | can_login | parent_roles | rate_limit | ssl_required --------- | --------- | -------------- | ---------- | ------------ analyst | true | | 0 | false manager | true | | 0 | false alice | true | analyst | 0 | falsePrivilege Enforcement
Privileges are checked on every query() and execute() call. A user without SELECT on a table receives:
permission denied: SELECT privilege required on table 'sales' for user 'guest'12. Row-Level Security (RLS)
RLS restricts which rows a user can see based on policy conditions.
Enabling RLS
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;Creating Policies
CREATE POLICY user_orders ON ordersFOR SELECTUSING (user_id = current_user);This means each user can only see their own orders.
Viewing Policies
SELECT * FROM pg_rls_policies;Example output:
table_name | policy_name | command | roles | using_expr ---------- | ----------- | ------- | ------ | ---------------------------- orders | user_orders | SELECT | public | (user_id = current_user)Dropping Policies
DROP POLICY user_orders ON orders;13. Rate Limiting
Per-role request rate limiting prevents abuse and ensures fair resource distribution.
Creating a Rate-Limited Role
CREATE ROLE api_user WITH LOGIN RATE_LIMIT 100;This limits api_user to 100 requests per minute.
Viewing Rate Limits
SELECT * FROM helios_user_quotas;Example output:
role_name | rate_limit | requests_used | window_start --------- | ---------- | ------------- | ---------------------- api_user | 100 | 47 | 2026-03-24T10:00:00ZWhen the limit is exceeded, requests return:
rate limit exceeded: api_user is limited to 100 requests per minute14. SSL Enforcement
Roles can require encrypted connections, rejecting any plaintext connection attempt.
Creating an SSL-Required Role
CREATE ROLE secure_user WITH LOGIN SSL_REQUIRED;Behavior
When secure_user connects without TLS, the connection is refused:
SSL required: user 'secure_user' must connect via TLSMonitoring SSL Connections
SELECT * FROM pg_stat_ssl;Example output:
pid | ssl | version | cipher | bits | compression --- | --- | ------- | --------------- | ---- | ----------- 1 | t | TLSv1.3 | TLS_AES_256_GCM | 256 | off15. Plan Cache & Result Cache
HeliosDB employs a 3-tier caching system to accelerate repeated queries.
Tier 1: Plan Cache
Repeated SQL queries skip parsing and planning. The cache is keyed by a hash of the SQL text. DDL statements (CREATE, ALTER, DROP) automatically invalidate the plan cache.
Tier 2: Result Cache
Query results are cached and returned directly for identical queries against unchanged data. DML operations (INSERT, UPDATE, DELETE) invalidate relevant cache entries:
INSERT INTO usersinvalidates caches for theuserstableREFRESH MATERIALIZED VIEWinvalidates caches for the base and MV tables- DDL operations invalidate all caches
Tier 3: Statistics Cache
Table statistics (histograms, cardinality estimates) are cached and invalidated on relevant DML.
Cache Behavior
All caching is automatic and transparent. No SQL interface is needed. The caches are consulted on every query() call:
query("SELECT * FROM users WHERE id = 1") -> Check result cache (hit? return immediately) -> Check plan cache (hit? skip parse+plan) -> Execute and store results in cache16. Time-Travel Queries
Query the state of any table at a past point in time using MVCC snapshots.
Syntax
SELECT * FROM users AS OF TIMESTAMP '2026-03-20T12:00:00';What happens: HeliosDB reads the MVCC version chain and returns the row versions visible at the specified timestamp.
Comparing States
-- What did the data look like yesterday?SELECT * FROM orders AS OF TIMESTAMP '2026-03-23T00:00:00';
-- Current data for comparisonSELECT * FROM orders;Use Cases
- Audit: “What was this row before it was updated?”
- Debugging: “What did the data look like when the bug was reported?”
- Compliance: Prove the state of data at a regulatory checkpoint
17. Database Branching
HeliosDB supports Git-like branching of the entire database state, enabling safe experimentation and parallel development.
Creating a Branch
CREATE BRANCH dev;Creates a copy-on-write branch from the current state. Data is shared until modified.
Querying a Branch
SELECT * FROM users ON BRANCH dev;Merging Branches
MERGE BRANCH dev INTO main;Applies changes from dev back into main.
Dropping a Branch
DROP BRANCH old_feature;Listing Branches
SELECT * FROM pg_database_branches;Example output:
branch_name | branch_id | parent_id | created_at ----------- | --------- | --------- | ---------------------- main | 1 | | 2026-01-01T00:00:00Z dev | 2 | 1 | 2026-03-24T10:00:00ZBranch + Time-Travel Combo
-- Create a branch from a past snapshotCREATE BRANCH recovery AS OF TIMESTAMP '2026-03-20T12:00:00';18. Full-Text Search
HeliosDB supports PostgreSQL-compatible full-text search using tsvector and tsquery.
Basic Full-Text Query
SELECT title, bodyFROM articlesWHERE to_tsvector(body) @@ to_tsquery('database & performance');Operators:
&— AND (both terms must appear)|— OR (either term)!— NOT (term must not appear)
Combining with Other Filters
SELECT *FROM articlesWHERE to_tsvector(body) @@ to_tsquery('rust | performance') AND published = trueORDER BY created_at DESCLIMIT 10;19. Vector Search
HeliosDB provides native vector similarity search with HNSW indexes, Product Quantization, and SIMD acceleration.
Creating a Vector Column
CREATE TABLE embeddings ( id INT PRIMARY KEY, title TEXT, vec VECTOR(128));Inserting Vectors
INSERT INTO embeddings (id, title, vec)VALUES (1, 'example', '[0.1, 0.2, 0.3, ...]');Creating an HNSW Index
CREATE INDEX idx_embeddings_vec ON embeddings USING hnsw (vec);K-Nearest Neighbor Search
-- L2 distance (Euclidean)SELECT id, title, vec <-> '[0.1, 0.2, 0.3, ...]' AS distanceFROM embeddingsORDER BY vec <-> '[0.1, 0.2, 0.3, ...]'LIMIT 10;Distance Operators
| Operator | Distance Function |
|---|---|
<-> | L2 (Euclidean) distance |
<#> | Negative inner product |
<=> | Cosine distance |
Distance Functions
SELECT cosine_similarity(vec, '[0.1, 0.2, ...]') FROM embeddings;SELECT l2_distance(vec, '[0.1, 0.2, ...]') FROM embeddings;SELECT inner_product(vec, '[0.1, 0.2, ...]') FROM embeddings;Vector Index Statistics
SELECT * FROM pg_vector_index_stats;20. Transparent Data Encryption (TDE)
All data at rest is encrypted using AES-256-GCM when TDE is enabled.
Configuration
TDE is configured at startup via Config or TOML:
[encryption]enabled = trueOr programmatically:
let mut config = Config::default();config.encryption.enabled = true;let db = EmbeddedDatabase::open_with_config(config)?;How It Works
- Algorithm: AES-256-GCM (authenticated encryption)
- Key management:
KeyManagerhandles key generation, rotation, and storage - Scope: All
data:*keys in the storage backend are encrypted before writing and decrypted on read - PBKDF2: 100,000 iterations (NIST SP 800-132 compliant) for key derivation
- Performance: Encryption/decryption is transparent to the SQL layer
Monitoring
SELECT * FROM pg_encryption_status;SELECT * FROM pg_table_encryption;SELECT * FROM pg_segment_encryption;21. Triggers & PL/pgSQL
HeliosDB supports triggers and PL/pgSQL stored functions with loops, cursors, conditionals, and exception handling.
Creating a PL/pgSQL Function
CREATE FUNCTION audit_trigger() RETURNS TRIGGER AS $$BEGIN INSERT INTO audit_log (table_name, action, changed_at) VALUES (TG_TABLE_NAME, TG_OP, NOW()); RETURN NEW;END;$$ LANGUAGE plpgsql;Creating a Trigger
CREATE TRIGGER audit_changes AFTER INSERT OR UPDATE ON users FOR EACH ROW EXECUTE FUNCTION audit_trigger();Trigger Timing and Events
| Timing | Events |
|---|---|
BEFORE | INSERT, UPDATE, DELETE, TRUNCATE |
AFTER | INSERT, UPDATE, DELETE, TRUNCATE |
INSTEAD OF | INSERT, UPDATE, DELETE (for views) |
Trigger Granularity
-- Per-row triggerCREATE TRIGGER row_trigger AFTER INSERT ON orders FOR EACH ROW EXECUTE FUNCTION my_func();
-- Per-statement triggerCREATE TRIGGER stmt_trigger AFTER INSERT ON orders FOR EACH STATEMENT EXECUTE FUNCTION my_func();Constraint Triggers
CREATE CONSTRAINT TRIGGER check_balance AFTER UPDATE ON accounts FOR EACH ROW EXECUTE FUNCTION validate_balance();PL/pgSQL Features
CREATE FUNCTION calculate_bonus(emp_id INT) RETURNS FLOAT8 AS $$DECLARE emp_salary FLOAT8; bonus FLOAT8;BEGIN SELECT salary INTO emp_salary FROM employees WHERE id = emp_id;
IF emp_salary > 100000 THEN bonus := emp_salary * 0.10; ELSIF emp_salary > 50000 THEN bonus := emp_salary * 0.15; ELSE bonus := emp_salary * 0.20; END IF;
RETURN bonus;END;$$ LANGUAGE plpgsql;Dropping Triggers
DROP TRIGGER audit_changes ON users;DROP TRIGGER audit_changes ON users CASCADE;22. Prepared Statements
Prepared statements separate parsing/planning from execution, improving performance for repeated queries and preventing SQL injection.
PREPARE
PREPARE get_user AS SELECT * FROM users WHERE id = $1;EXECUTE
EXECUTE get_user(42);EXECUTE get_user(99);DEALLOCATE
-- Single statementDEALLOCATE get_user;
-- All prepared statementsDEALLOCATE ALL;How It Works
Prepared statements are stored in an Arc<RwLock<HashMap<String, LogicalPlan>>> and persist for the session. The plan is compiled once and re-executed with different parameters via execute_plan_with_params().
23. Transactions & Savepoints
HeliosDB provides full ACID transactions with savepoint support.
Basic Transaction
BEGIN;INSERT INTO accounts (id, balance) VALUES (1, 1000);UPDATE accounts SET balance = balance - 100 WHERE id = 1;COMMIT;Rollback
BEGIN;DELETE FROM important_data;-- Oops, undo everythingROLLBACK;Savepoints
Savepoints allow partial rollbacks within a transaction:
BEGIN;INSERT INTO orders (id, amount) VALUES (1, 100);SAVEPOINT before_discount;UPDATE orders SET amount = 50 WHERE id = 1;-- Undo only the updateROLLBACK TO SAVEPOINT before_discount;-- The INSERT is still intactCOMMIT;Release Savepoint
BEGIN;SAVEPOINT sp1;INSERT INTO log (msg) VALUES ('step 1');RELEASE SAVEPOINT sp1; -- sp1 no longer available for rollbackCOMMIT;24. UPSERT (ON CONFLICT)
Insert or update in a single atomic operation.
ON CONFLICT DO UPDATE
INSERT INTO users (id, name, email, login_count)VALUES (1, 'Alice', 'alice@example.com', 1)ON CONFLICT (id) DO UPDATESET login_count = users.login_count + 1, name = EXCLUDED.name;If id = 1 already exists, the login_count is incremented and name is updated. Otherwise, the row is inserted.
ON CONFLICT DO NOTHING
INSERT INTO tags (name)VALUES ('rust')ON CONFLICT (name) DO NOTHING;Silently skips the insert if the tag already exists.
25. Window Functions
HeliosDB supports 10+ window functions for analytics queries.
Available Functions
| Function | Description |
|---|---|
ROW_NUMBER() | Sequential number per partition |
RANK() | Rank with gaps for ties |
DENSE_RANK() | Rank without gaps |
LAG(expr, offset, default) | Value from previous row |
LEAD(expr, offset, default) | Value from next row |
FIRST_VALUE(expr) | First value in window frame |
LAST_VALUE(expr) | Last value in window frame |
NTH_VALUE(expr, n) | Nth value in window frame |
NTILE(n) | Distribute rows into n buckets |
PERCENT_RANK() | Relative rank (0 to 1) |
CUME_DIST() | Cumulative distribution |
SUM/AVG/COUNT/MIN/MAX | Aggregate window functions |
Examples
-- Rank employees by salary within each departmentSELECT name, department, salary, RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rankFROM employees;-- Running total of salesSELECT order_date, amount, SUM(amount) OVER (ORDER BY order_date) AS running_totalFROM orders;-- Compare each sale to the previous oneSELECT order_date, amount, LAG(amount, 1, 0) OVER (ORDER BY order_date) AS prev_amount, amount - LAG(amount, 1, 0) OVER (ORDER BY order_date) AS changeFROM orders;26. JSONB Support
HeliosDB provides 20+ JSONB functions for document-style queries.
Storing JSON Data
CREATE TABLE events ( id INT PRIMARY KEY, data JSONB);
INSERT INTO events (id, data)VALUES (1, '{"type": "click", "page": "/home", "meta": {"browser": "Firefox"}}');Extraction Functions
-- Extract a nested valueSELECT jsonb_extract_path(data, 'meta', 'browser') FROM events;-- Result: "Firefox"
-- Extract as textSELECT jsonb_extract_path_text(data, 'type') FROM events;-- Result: clickArray Functions
SELECT jsonb_array_elements('[1, 2, 3]'::JSONB);SELECT jsonb_array_elements_text('["a", "b", "c"]'::JSONB);SELECT jsonb_array_length('[1, 2, 3]'::JSONB);-- Result: 3Object Functions
SELECT jsonb_object_keys('{"a": 1, "b": 2}'::JSONB);-- Results: a, b
SELECT jsonb_typeof('"hello"'::JSONB);-- Result: string
SELECT jsonb_typeof('42'::JSONB);-- Result: number27. Common Table Expressions (CTEs)
HeliosDB supports both regular and recursive CTEs.
Regular CTE
WITH active_users AS ( SELECT * FROM users WHERE active = true)SELECT name, email FROM active_users WHERE created_at > '2026-01-01';Recursive CTE
WITH RECURSIVE hierarchy(id, name, manager_id, level) AS ( -- Base case: top-level managers SELECT id, name, manager_id, 0 FROM employees WHERE manager_id IS NULL
UNION ALL
-- Recursive case: employees under each manager SELECT e.id, e.name, e.manager_id, h.level + 1 FROM employees e JOIN hierarchy h ON e.manager_id = h.id)SELECT * FROM hierarchy ORDER BY level, name;Number Sequence
WITH RECURSIVE nums(n) AS ( SELECT 1 UNION ALL SELECT n + 1 FROM nums WHERE n < 10)SELECT n FROM nums;28. System Views
HeliosDB provides 60+ system views across three registries for comprehensive introspection.
Core Catalog Views
| View | Description |
|---|---|
pg_tables | List all user tables |
pg_views | List views |
pg_indexes | List indexes (including HNSW) |
pg_columns | Column definitions |
pg_constraints | Constraints (PK, FK, UNIQUE, CHECK) |
pg_settings | Runtime configuration parameters |
Session & Activity
| View | Description |
|---|---|
pg_stat_activity | Active queries and sessions |
pg_stat_database | Database-level statistics |
pg_stat_ssl | SSL/TLS connection statistics |
pg_authid | Authentication identities |
MVCC & Branching
| View | Description |
|---|---|
pg_branches / pg_database_branches | List all database branches |
pg_snapshots | Active MVCC snapshots |
pg_transaction_map | Transaction ID to timestamp mapping |
pg_scn_map / pg_current_scn | System Change Number tracking |
Optimizer & Execution
| View | Description |
|---|---|
pg_stat_optimizer | Query optimizer statistics |
pg_parallel_exec_stats | Parallel execution metrics |
pg_predicate_pushdown_stats | Predicate pushdown performance |
Materialized Views
| View | Description |
|---|---|
pg_matviews | List materialized views |
pg_mv_staleness | MV freshness tracking |
pg_mv_cpu_usage | MV refresh CPU usage |
Security & RBAC
| View | Description |
|---|---|
helios_grants | All privilege grants |
helios_roles | Role definitions and hierarchy |
helios_user_quotas | Per-user rate limits and usage |
pg_rls_policies | Row-Level Security policies |
Storage & HTAP
| View | Description |
|---|---|
pg_htap_layout_stats | HTAP storage layout statistics |
helios_workload_patterns | Column access patterns for HTAP tuning |
pg_data_store_types | Available storage types |
pg_compression_stats | Compression ratios and statistics |
heliosdb_compression_stats | Detailed compression metrics |
heliosdb_pattern_stats | Data pattern analysis |
heliosdb_compression_events | Compression event log |
SMFI (Speculative Multi-Filter Indexing)
| View | Description |
|---|---|
helios_filter_metrics | Filter performance metrics |
helios_filter_recommendations | Suggested filter improvements |
helios_filter_budget | Storage budget for filters |
helios_filter_build_queue | Filter build queue status |
helios_filter_types | Available filter types |
pg_smfi_status | SMFI system status |
pg_smfi_table_stats | Per-table SMFI statistics |
pg_speculative_filters | Active speculative filters |
High Availability & Proxy
| View | Description |
|---|---|
helios_topology | Cluster topology with health info |
helios_node_aliases | Node alias mappings |
helios_ha_status | HA cluster status summary |
pg_stat_replication | Replication status |
pg_stat_progress_vacuum | Maintenance progress |
helios_sync_status | HeliosDB sync metrics |
helios_proxy_pools | Connection pool statistics |
helios_proxy_status | Proxy status summary |
helios_proxy_cache_stats | Query cache statistics |
helios_proxy_routing_stats | Query routing statistics |
helios_proxy_lag_stats | Replica lag statistics |
helios_proxy_rate_limit_stats | Rate limiting statistics |
helios_proxy_circuit_breaker_stats | Circuit breaker statistics |
helios_proxy_analytics_stats | Query analytics |
helios_proxy_auth_stats | Authentication statistics |
helios_proxy_rewriter_stats | Query rewriting statistics |
helios_proxy_plugin_stats | WASM plugin statistics |
helios_proxy_graphql_stats | GraphQL gateway statistics |
helios_proxy_schema_routing_stats | Schema-aware routing stats |
Encryption
| View | Description |
|---|---|
pg_encryption_status | TDE status overview |
pg_table_encryption | Per-table encryption status |
pg_segment_encryption | Per-segment encryption details |
Operational
| View | Description |
|---|---|
helios_query_history | Query execution history |
helios_table_memory_stats | Per-table memory usage |
helios_transaction_stats | Transaction statistics |
pg_maintenance_stats | Maintenance task statistics |
pg_access_tracker_stats | Access pattern tracking |
HeliosCore Storage Tier
| View | Description |
|---|---|
pg_helioscore_api_stats | HeliosCore API call statistics |
pg_block_file_stats | Block file I/O statistics |
pg_wal_stats | WAL write statistics |
pg_segment_stats | Storage segment statistics |
pg_shard_stats | Shard distribution statistics |
pg_staging_area_stats | Staging area utilization |
pg_aux_structure_stats | Auxiliary structure (bloom, zone map) stats |
Multi-Tenancy
| View | Description |
|---|---|
pg_tenant_usage | Per-tenant resource usage |
pg_cdc_events | Change Data Capture events |
pg_distribcache_stats | Distributed cache statistics |
Example Usage
-- Check active sessionsSELECT * FROM pg_stat_activity;
-- View optimizer statisticsSELECT * FROM pg_stat_optimizer;
-- Check HTAP workload patternsSELECT * FROM helios_workload_patterns;
-- Monitor compressionSELECT * FROM pg_compression_stats;
-- Check encryption statusSELECT * FROM pg_encryption_status;29. Connection Protocols
HeliosDB supports three connection protocols.
PostgreSQL Wire Protocol (Port 5432)
Full PostgreSQL wire protocol compatibility with SCRAM-SHA-256 authentication.
Features:
- Simple and extended query protocols
- SCRAM-SHA-256 authentication (PBKDF2 with 100,000 iterations)
- SSL/TLS support
- Per-role SSL enforcement
Connect with psql:
psql -h localhost -p 5432 -U myuser -d heliosdbConnect with any PostgreSQL driver (Python, Go, Java, Node.js, etc.):
import psycopg2conn = psycopg2.connect( host="localhost", port=5432, user="myuser", password="mypassword", dbname="heliosdb")REST API
HTTP-based query interface with JSON request/response format. Includes rate limiting middleware, JWT authentication, and Supabase-compatible PostgREST endpoints.
Endpoints:
POST /query— Execute SQL queriesPOST /execute— Execute DML statementsGET /tables— List tables- PostgREST-compatible data endpoints
gRPC (Storage Tier)
Protocol Buffers-based interface for distributed storage operations. Used internally by the HeliosCore storage tier for node-to-node communication.
Default port: 50051
Operations: Scan, Get, Put, Delete, BatchWrite, CreateSnapshot, and more.
Architecture Summary
Clients | +-- PostgreSQL Wire Protocol (SCRAM-SHA-256) +-- REST API (JSON / JWT) +-- gRPC (Protobuf) |SQL Engine | +-- Parser (sqlparser-rs) +-- Planner (logical plan, 69+ node types) +-- Optimizer (8 rule modules: predicate pushdown, join reorder, | MV rewrite, partition pruning, projection, subquery, | storage pushdown, expression simplification) +-- Executor (Volcano model, InstrumentedOperator for ANALYZE) |Storage Engine | +-- MVCC (snapshot isolation, read-committed) +-- WAL (write-ahead log with checkpointing) +-- NativeBackend (O_DIRECT I/O) +-- ART Indexes (primary + secondary) +-- HNSW Vector Indexes (with Product Quantization) +-- Graph Adjacency Lists (BFS, Dijkstra, A*) +-- BM25 Inverted Index + Hybrid Search (RRF, MMR) +-- Bloom Filters + Zone Maps +-- Compression (Zstd, LZ4, Snappy, dictionary) +-- Columnar / Dictionary / Content-Addressed storage +-- Branching (Git-like, copy-on-write) +-- Time-Travel (MVCC version chains) |Security | +-- TDE (AES-256-GCM, key rotation) +-- RBAC (GRANT/REVOKE, role hierarchy) +-- RLS (row-level security policies) +-- Rate Limiting (per-role) +-- SSL Enforcement (per-role) +-- SCRAM-SHA-256 (PBKDF2, 100K iterations) |High Availability (feature-gated) | +-- WAL Streaming Replication +-- Multi-Primary Replication +-- Sharding +-- HeliosProxy (13 features: caching, routing, rate limiting, circuit breaker, analytics, auth, query rewriting, WASM plugins, GraphQL gateway, schema routing, etc.)30. Graph Adjacency Lists
HeliosDB includes a property-graph layer (src/graph/) for modeling relationships as directed, typed, weighted edges between nodes.
Data Model
- Nodes: 64-bit unsigned ID, optional label and properties map
- Edges: Directed, typed (label string), optionally weighted (default 1.0)
- Storage: DashMap-based concurrent adjacency lists (out-edges and in-edges per node)
API
use heliosdb_lite::graph::{GraphStore, Edge};
let g = GraphStore::in_memory();g.add_edge(Edge::new(1, 2, "FOLLOWS"));g.add_edge(Edge::with_weight(2, 3, "KNOWS", 0.8));Traversal Algorithms
| Algorithm | Function | Description |
|---|---|---|
| BFS | bfs(store, source, max_depth) | Breadth-first, returns visit order + depth map |
| Dijkstra | dijkstra(store, source) | Single-source shortest paths (non-negative weights) |
| A* | a_star(store, source, target, heuristic) | Heuristic-guided shortest path |
Planned SQL Functions
The graph module declares four SQL functions for future wiring:
| Function | Description |
|---|---|
graph_bfs(source, max_depth) | BFS traversal returning reachable node IDs |
graph_shortest_path(source, target) | Dijkstra shortest path |
graph_a_star(source, target) | A* with user-supplied heuristic column |
graph_neighbors(node) | Immediate out-neighbours as (to, label, weight) |
Status: SQL function wiring is deferred until the storage-backed EdgeStore implementations are in place. The in-memory graph store is fully functional for API use and testing.
31. BM25 & Hybrid Search
HeliosDB provides BM25 lexical search and hybrid (BM25 + vector) search for RAG and full-text retrieval workloads.
BM25 Inverted Index
The Bm25Index implements the Okapi BM25 ranking function with configurable parameters:
- k1 (default 1.2): Term frequency saturation
- b (default 0.75): Document length normalization
use heliosdb_lite::search::{Bm25Index, Bm25Params, tokenize};
let mut idx = Bm25Index::new(Bm25Params::default());idx.add_document(1, "the quick brown fox");idx.add_document(2, "the lazy brown dog");
let results = idx.search("quick fox", 10);// Returns ScoredDoc { doc_id: 1, score: ... }Unicode-Aware Tokenizer
The tokenizer performs lowercase normalization and ASCII folding (e.g., accented characters to ASCII equivalents), making searches accent-insensitive.
Bloom Filter Skip
The filter_skip module adapts the existing Bloom filter infrastructure to short-circuit term lookups. If the Bloom filter says a term does not exist in a posting list, the lookup is skipped entirely — eliminating I/O for non-existent terms.
Hybrid Search
Combines BM25 text scores with vector cosine similarity scores using two fusion strategies:
| Strategy | Description |
|---|---|
| Linear | w_bm25 * bm25_score + w_vec * vec_score |
| RRF | Reciprocal Rank Fusion (k=60 default) |
use heliosdb_lite::search::{hybrid_search, HybridConfig, HybridScore};
let results = hybrid_search(&bm25_index, "search query", &vector_results, HybridConfig::default());Rerankers
| Reranker | Function | Use Case |
|---|---|---|
| RRF | rrf(lists, params, limit) | Combine multiple ranked lists without score calibration |
| MMR | mmr(candidates, lambda, sim_fn, limit) | Diversify results by penalizing redundancy |
MMR uses the greedy formula: lambda * relevance(d) - (1 - lambda) * max_sim(d, selected). A lambda of 1.0 reduces to plain relevance ordering; 0.0 maximizes diversity.
32. Arena Allocation
The RequestArena (src/runtime/arena.rs) provides per-request bump allocation using bumpalo, eliminating per-object allocator overhead for transient buffers.
Key Properties
- Zero-cost deallocation: Dropping the arena frees all allocations at once
- Reusable: Reset and reuse across requests to keep warmed memory chunks live
- Default capacity: 16 KiB initial chunk (sized for typical query scratch state)
- Telemetry:
bytes_allocated()andchunk_count()for monitoring
Usage
use heliosdb_lite::runtime::RequestArena;
let arena = RequestArena::with_capacity(4 * 1024);let scratch = bumpalo::collections::Vec::with_capacity_in(16, arena.bump());// All allocations freed when `arena` is dropped -- no per-object free.Use Cases
- HNSW candidate queues during vector search
- BM25 term hash buffers
- Row pointer scratch space during parallel scans
- Visited bitsets during graph traversal
33. Compiled Query Plans
The PlanCache (src/sql/compiled.rs) eliminates repeated parse and plan cost for frequently executed SQL statements.
Components
| Component | Description |
|---|---|
PlanFingerprint | Stable 64-bit hash of normalized SQL (comments stripped, literals replaced with ?) |
CompiledPlan | Parsed + validated prepared statement with parameter slots |
PlanCache | Bounded LRU (default 512 entries), thread-safe via parking_lot::Mutex |
How It Works
1. fingerprint(sql) -> cache lookup hit? -> bind parameters, execute directly miss? -> parse + plan + optimize + insert into cacheNormalization
The fingerprint normalizer strips comments, collapses whitespace, lowercases keywords, and replaces literal values with positional placeholders. This ensures semantically identical queries with different literal values share a single cached plan.
34. SQL Cursors
HeliosDB supports server-side cursors for iterating over large result sets without loading all rows into memory.
Syntax
-- Declare a cursorDECLARE my_cursor CURSOR FOR SELECT * FROM large_table WHERE status = 'active';
-- Fetch rows in batchesFETCH 100 FROM my_cursor;FETCH 100 FROM my_cursor;
-- Close when doneCLOSE my_cursor;Behavior
- Cursors are scoped to the current session
- FETCH returns the next N rows from the cursor position
- CLOSE releases the cursor resources
- Cursors are automatically closed on transaction rollback and session disconnect
35. Metrics & Health Endpoints
HeliosDB exposes Prometheus-compatible metrics and a health check endpoint for production monitoring.
/health
Returns HTTP 200 with a JSON body indicating database status:
{"status": "healthy"}/metrics
Returns Prometheus-format metrics including:
- Query count and latency histograms
- Connection count (active, idle)
- Storage utilization
- Cache hit rates (plan, result, statistics)
- MVCC GC statistics
Quick Reference: SQL Command Summary
| Category | Commands |
|---|---|
| DDL | CREATE/ALTER/DROP TABLE, CREATE INDEX, CREATE VIEW, CREATE MATERIALIZED VIEW |
| DML | SELECT, INSERT, UPDATE, DELETE, UPSERT (ON CONFLICT) |
| Transactions | BEGIN, COMMIT, ROLLBACK, SAVEPOINT, RELEASE SAVEPOINT |
| Branching | CREATE BRANCH, DROP BRANCH, MERGE BRANCH, SELECT … ON BRANCH |
| Time-Travel | SELECT … AS OF TIMESTAMP |
| Backup | BACKUP DATABASE, RESTORE DATABASE, RECOVER DATABASE |
| Maintenance | VACUUM, VACUUM FULL, VACUUM ANALYZE |
| Security | CREATE ROLE, GRANT, REVOKE, ALTER TABLE … ENABLE ROW LEVEL SECURITY, CREATE POLICY |
| Explain | EXPLAIN, EXPLAIN ANALYZE, EXPLAIN (FORMAT JSON/YAML/TREE), EXPLAIN (STORAGE/AI/WHY_NOT/TRACE) |
| Prepared | PREPARE, EXECUTE, DEALLOCATE |
| Triggers | CREATE TRIGGER, DROP TRIGGER, CREATE FUNCTION |
| MV | CREATE MATERIALIZED VIEW, REFRESH MATERIALIZED VIEW |
| Vector | CREATE INDEX … USING hnsw, <->, <#>, <=>, cosine_similarity(), l2_distance() |
| FTS | to_tsvector(), to_tsquery(), @@ |
| Graph | graph_bfs(), graph_shortest_path(), graph_a_star(), graph_neighbors() |
| BM25/Hybrid | BM25 inverted index, hybrid_search(), RRF, MMR rerankers |
| Cursors | DECLARE, FETCH, CLOSE |
| Monitoring | /metrics (Prometheus), /health endpoint |
| Storage | ALTER TABLE … ALTER COLUMN … SET STORAGE COLUMNAR/DICTIONARY/CONTENT_ADDRESSED |