Transparent, Reproducible, Fair

Every performance claim we make is backed by open-source benchmark code that anyone can run. We believe that database benchmarks should be transparent in methodology, reproducible in any environment, and fair to all systems under comparison. This document describes exactly how we conduct our benchmarks -- from hardware configuration to statistical analysis -- so you can evaluate our results with full confidence and reproduce them yourself.

We follow three principles:

  1. No cherry-picking. We report all 35 query categories, including the ones where PostgreSQL wins.
  2. Equal resources. Both systems get the same CPU, memory, and storage allocation.
  3. Default configurations. HeliosDB uses out-of-the-box defaults. PostgreSQL gets recommended production tuning (which favors PostgreSQL).

Benchmark Categories

We run four distinct types of benchmarks, each designed to measure a different dimension of database performance.

Micro-benchmarks (Single-Operation Latency)

Measure the raw cost of individual operations in an embedded deployment, with no network overhead. These isolate the storage engine and query executor.

| Operation | What It Measures |
| --- | --- |
| Point lookup (PK) | ART index traversal + MVCC read |
| Single-row INSERT | Write path + WAL + commit |
| Range scan (100 rows) | Sequential read throughput |
| Aggregate (COUNT/SUM) | Full-table scan + computation |
| Index creation | Bulk ART construction |

Scalability Benchmarks (Throughput vs. Concurrency)

Measure how throughput scales as concurrent clients increase, using pgbench over the PostgreSQL wire protocol.

| Concurrency Level | Purpose |
| --- | --- |
| 10 clients | Baseline contention-free performance |
| 50 clients | Moderate concurrency |
| 100 clients | Production-typical load |
| 200 clients | High-concurrency stress |
| 500 clients | Connection pool saturation |
| 1000 clients | Extreme concurrency |

Comparison Benchmarks (HeliosDB vs. PostgreSQL 16)

Head-to-head comparison across 35 SQL query categories. Both engines receive identical schemas, identical data, and identical queries. The embedded Rust-native comparison eliminates network variables.

35 Query Categories

| # | Category | # | Category |
| --- | --- | --- | --- |
| 1 | CREATE TABLE | 19 | LEFT JOIN |
| 2 | CREATE INDEX | 20 | Multi-table JOIN (4 tables) |
| 3 | ALTER TABLE | 21 | Scalar subquery |
| 4 | DROP TABLE | 22 | EXISTS subquery |
| 5 | CREATE/DROP VIEW | 23 | IN subquery |
| 6 | REFRESH Materialized View | 24 | Common Table Expression (CTE) |
| 7 | TRUNCATE | 25 | Recursive CTE |
| 8 | INSERT (single-row) | 26 | Window functions |
| 9 | INSERT (multi-row) | 27 | UNION |
| 10 | INSERT...SELECT | 28 | DISTINCT |
| 11 | UPDATE (point) | 29 | ORDER BY + LIMIT |
| 12 | DELETE (point) | 30 | CASE expressions |
| 13 | UPSERT (ON CONFLICT) | 31 | JSON filter |
| 14 | UPDATE with subquery | 32 | LIKE / BETWEEN / IN |
| 15 | Point lookup (PK) | 33 | Transaction control |
| 16 | Full scan + filter | 34 | Prepared statements |
| 17 | Aggregation | 35 | SET / SHOW / RESET |
| 18 | INNER JOIN | | |

Docker Benchmarks (Wire Protocol via pgbench)

Compare HeliosDB + HeliosProxy against PostgreSQL + PgBouncer in Docker containers, using pgbench as the standard load generator. This tests real-world deployment conditions: network overhead, connection pooling, and container resource constraints.


Hardware and Environment

Standard Test Environment

All published benchmarks are run on the following reference hardware unless otherwise noted:

| Component | Specification |
| --- | --- |
| CPU | 8-core (x86_64) |
| Memory | 32 GB DDR4 |
| Storage | NVMe SSD (sequential read >3 GB/s) |
| OS | Linux (kernel 5.14+) |
| Rust | stable (latest) |
| Docker | 24.x with Compose v2 |

Docker Benchmark Resource Allocation

Both database systems receive equal container resources to ensure a fair comparison:

# PostgreSQL container
deploy:
  resources:
    limits:
      cpus: '2'
      memory: 2G

# HeliosDB container
deploy:
  resources:
    limits:
      cpus: '2'
      memory: 2G

PostgreSQL Tuning

PostgreSQL receives production-recommended tuning. This favors PostgreSQL, since HeliosDB uses default configuration.

shared_buffers = 512MB
effective_cache_size = 2GB
work_mem = 64MB
max_connections = 500

HeliosDB Configuration

HeliosDB runs with default out-of-the-box settings. No special tuning is applied. This demonstrates real-world performance for users who deploy without manual optimization.


Methodology

Test Execution Protocol

Each benchmark follows the same execution protocol:

1. Schema creation (identical DDL for both systems)
2. Data population (identical rows, same cardinality)
3. Warmup phase: 30 seconds (queries run but not measured)
4. Measurement phase: 30 seconds (all metrics collected)
5. Cooldown and results aggregation

Both the warmup and measurement durations are configurable. Published results use 30-second windows unless stated otherwise.
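
The protocol's timing core can be sketched as a simple loop. This is an illustrative sketch, not the actual harness; `query_fn` is a stand-in for whatever query the category under test executes:

```python
import time

def run_benchmark(query_fn, warmup_secs=30.0, measure_secs=30.0):
    """Warmup then measure, mirroring steps 3-4 of the protocol."""
    # Warmup phase: execute queries but discard all timings.
    deadline = time.monotonic() + warmup_secs
    while time.monotonic() < deadline:
        query_fn()

    # Measurement phase: record per-query latency in microseconds.
    latencies_us = []
    deadline = time.monotonic() + measure_secs
    while time.monotonic() < deadline:
        start = time.monotonic()
        query_fn()
        latencies_us.append((time.monotonic() - start) * 1e6)
    return latencies_us
```

The collected latencies feed the percentile metrics described under "Metrics Collected".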

Repetition and Reporting

  • Each test configuration runs 3 times
  • The median of the 3 runs is reported (not the best, not the average)
  • This eliminates single-run anomalies while avoiding outlier influence
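
For example, the median-of-three rule applied to hypothetical TPS figures:

```python
from statistics import median

# Three hypothetical TPS results for one test configuration.
runs_tps = [2380.0, 2512.0, 2419.0]

# Report the median run: not the best run, not the mean.
reported = median(runs_tps)
```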

Benchmark Tools

| Benchmark Type | Tool | Rationale |
| --- | --- | --- |
| Wire protocol (Docker) | pgbench (PostgreSQL 16) | Industry-standard OLTP benchmark tool |
| Embedded comparison | Rust-native harness | Eliminates network overhead; measures engine directly |
| Scalability | pgbench with varying -c flag | Standard concurrency scaling methodology |

Query Design

Queries are designed to isolate specific operations:

-- Point lookup (PK index scan)
SELECT * FROM customers WHERE id = 42;

-- Range scan with filter
SELECT * FROM customers WHERE age BETWEEN 25 AND 35;

-- Aggregation
SELECT region, COUNT(*), AVG(age) FROM customers GROUP BY region;

-- INNER JOIN (2 tables)
SELECT c.name, o.order_id, o.total
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id
WHERE c.region = 'East';

-- Multi-table JOIN (4 tables)
SELECT c.name, o.order_id, p.name, oi.quantity
FROM customers c
JOIN orders o ON o.customer_id = c.id
JOIN order_items oi ON oi.order_id = o.order_id
JOIN products p ON p.product_id = oi.product_id
WHERE o.status = 'shipped';

-- Window function
SELECT name, age, region,
       ROW_NUMBER() OVER (PARTITION BY region ORDER BY age DESC)
FROM customers;

-- Recursive CTE
WITH RECURSIVE tree AS (
    SELECT cat_id, name, parent_id, 0 AS depth
    FROM categories WHERE parent_id IS NULL
    UNION ALL
    SELECT c.cat_id, c.name, c.parent_id, t.depth + 1
    FROM categories c JOIN tree t ON c.parent_id = t.cat_id
)
SELECT * FROM tree ORDER BY depth, cat_id;

-- EXISTS subquery
SELECT c.name FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.customer_id = c.id AND o.status = 'shipped'
);

-- LIKE / BETWEEN / IN
SELECT * FROM customers
WHERE name LIKE 'Customer_1%'
  AND age BETWEEN 20 AND 50
  AND region IN ('East', 'West');

Test Data

Both systems are populated with identical data:

| Table | Rows | Description |
| --- | --- | --- |
| customers | 200 | 7 columns including name, email, age, region, metadata |
| products | 50 | 5 columns including category, price, description |
| orders | 500 | 5 columns with FK to customers |
| order_items | 1,000 | 5 columns with FKs to orders and products |
| categories | 20 | Hierarchical (self-referencing parent_id) |
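
The data shape can be sketched as follows. This is an illustrative generator, not the actual population code: column sets are abbreviated, and a fixed seed stands in for however the harness guarantees identical rows in both systems:

```python
import random

def make_test_data(seed=42):
    # Fixed seed: both systems must be populated with identical rows.
    rng = random.Random(seed)
    regions = ["East", "West", "North", "South"]
    customers = [
        {"id": i, "name": f"Customer_{i}", "age": rng.randint(18, 80),
         "region": rng.choice(regions)}
        for i in range(1, 201)
    ]
    products = [
        {"product_id": i, "price": round(rng.uniform(1.0, 500.0), 2)}
        for i in range(1, 51)
    ]
    orders = [
        {"order_id": i, "customer_id": rng.randint(1, 200),
         "status": rng.choice(["shipped", "pending"])}
        for i in range(1, 501)
    ]
    order_items = [
        {"order_id": rng.randint(1, 500), "product_id": rng.randint(1, 50),
         "quantity": rng.randint(1, 10)}
        for _ in range(1000)
    ]
    # 20 hierarchical categories: a few roots, the rest reference an earlier row.
    categories = [
        {"cat_id": i, "parent_id": None if i <= 4 else rng.randint(1, i - 1)}
        for i in range(1, 21)
    ]
    return customers, products, orders, order_items, categories
```

Note the `Customer_<n>` naming, which is what the `LIKE 'Customer_1%'` query above selects against.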

Metrics Collected

Primary Metrics

| Metric | Unit | Description |
| --- | --- | --- |
| TPS | transactions/sec | Completed transactions per second during measurement window |
| P50 latency | microseconds | Median latency (50th percentile) |
| P95 latency | microseconds | 95th percentile latency |
| P99 latency | microseconds | 99th percentile latency |

Derived Metrics

| Metric | Formula | Description |
| --- | --- | --- |
| Scaling factor | TPS(N threads) / TPS(1 thread) | How well throughput scales with concurrency |
| Scaling efficiency | Scaling factor / N | Percentage of ideal linear scaling |
| Comparison ratio | HeliosDB time / PostgreSQL time | Values < 1.0 mean HeliosDB is faster |
| Speedup | 1 / comparison ratio | e.g., ratio 0.14 ≈ 6.9x speedup |
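
These formulas are simple enough to state directly. A minimal sketch, where `n` is the concurrency multiple relative to the baseline run (the published scalability table baselines at 10 clients):

```python
def scaling_factor(tps_n, tps_baseline):
    """TPS at n-way concurrency relative to the baseline run."""
    return tps_n / tps_baseline

def scaling_efficiency(tps_n, tps_baseline, n):
    """Fraction of ideal linear scaling; n is the concurrency multiple."""
    return scaling_factor(tps_n, tps_baseline) / n

def comparison_ratio(helios_time, pg_time):
    """< 1.0 means HeliosDB is faster."""
    return helios_time / pg_time

def speedup(ratio):
    """Reciprocal of the comparison ratio."""
    return 1.0 / ratio
```

With the point-lookup figures (~39 us vs. ~270 us) this gives a ratio of ~0.14 and a ~6.9x speedup; with the 500-client scalability row (11,775 TPS against the 242 TPS baseline, n = 50) it gives ~97% efficiency.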

Per-Phase Tracing (HeliosDB Only)

HeliosDB benchmarks also collect internal per-phase timing via the built-in tracing system:

| Phase | What It Measures |
| --- | --- |
| Parse | SQL text to AST (sqlparser-rs) |
| Plan | AST to logical plan |
| Optimize | Rule-based and cost-based optimization passes |
| Execute | Plan execution against storage engine |

Enable tracing with:

SET helios.trace_queries = on;
-- Run queries...
SHOW helios.trace_report;

Statistical Rigor

Outlier Handling

  • The first and last 5% of samples in each measurement window are discarded
  • This eliminates JIT warmup artifacts and OS scheduling noise at the tail
  • Remaining samples are used for all percentile calculations
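
The trimming policy can be sketched as follows (an illustrative implementation; the percentile definition here is one of several common ones and may differ from the harness):

```python
def trimmed_percentiles(samples):
    """Drop the first and last 5% of samples in arrival order, then
    compute P50/P95/P99 over what remains."""
    n = len(samples)
    cut = n // 20  # 5% of the measurement window at each end
    kept = samples[cut:n - cut] if cut else list(samples)
    ranked = sorted(kept)

    def pct(p):
        # Simple index-based percentile; real harnesses may interpolate.
        return ranked[round(p / 100 * (len(ranked) - 1))]

    return {"p50": pct(50), "p95": pct(95), "p99": pct(99)}
```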

Confidence Intervals

  • Comparison benchmarks report 95% confidence intervals for the ratio
  • A result is only reported as a "win" if the confidence interval does not cross 1.0
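
One standard way to obtain such an interval is a percentile bootstrap over per-iteration timings. The sketch below is an assumed construction for illustration, not necessarily the harness's actual method:

```python
import random

def bootstrap_ratio_ci(helios_times, pg_times, reps=2000, seed=0):
    """Percentile-bootstrap 95% confidence interval for the mean-time ratio."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        # Resample each side with replacement and recompute the ratio of means.
        h = [rng.choice(helios_times) for _ in helios_times]
        p = [rng.choice(pg_times) for _ in pg_times]
        ratios.append((sum(h) / len(h)) / (sum(p) / len(p)))
    ratios.sort()
    return ratios[int(0.025 * reps)], ratios[int(0.975 * reps)]

def is_win(ci_low, ci_high):
    # A "win" requires the whole interval on one side of 1.0.
    return ci_high < 1.0 or ci_low > 1.0
```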

"Win" Definition

A system "wins" a category only if it is more than 5% faster than the other system. Results within 5% are reported as comparable (within noise margin).

| Ratio Range | Verdict |
| --- | --- |
| < 0.95 | HeliosDB wins |
| 0.95 -- 1.05 | Comparable (within noise) |
| > 1.05 | PostgreSQL wins |
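
The verdict rule as a function (boundary handling at exactly 0.95 and 1.05 is our reading of the table):

```python
def verdict(ratio):
    """Map a comparison ratio to a verdict, with a 5% noise margin."""
    if ratio < 0.95:
        return "HeliosDB wins"
    if ratio <= 1.05:
        return "Comparable (within noise)"
    return "PostgreSQL wins"
```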

Reporting Honesty

  • We report all 35 categories, including losses
  • Tables include the raw ratio so readers can judge for themselves
  • Where PostgreSQL wins, we explain why (and whether we plan to address it)

Comparison Methodology (vs. PostgreSQL)

Ground Rules

| Rule | Details |
| --- | --- |
| Same queries | Identical SQL text (adjusted only for syntax differences where necessary) |
| Same data | Identical row counts, identical values, same cardinality |
| Same hardware | Both run on the same machine (embedded) or same Docker host (wire protocol) |
| Warm caches | Both systems execute warmup queries before measurement begins |
| Index fairness | PostgreSQL uses B-tree indexes; HeliosDB uses ART indexes (each system's default) |
| Configuration | PostgreSQL gets production tuning (shared_buffers=512MB); HeliosDB uses defaults |

How Ratios Are Computed

ratio = HeliosDB average time / PostgreSQL average time

| Ratio | Meaning |
| --- | --- |
| 0.14 | HeliosDB is 6.9x faster |
| 0.40 | HeliosDB is 2.5x faster |
| 1.00 | Identical performance |
| 1.20 | PostgreSQL is 1.2x faster |

Schema and Index Comparison

PostgreSQL:  B-tree index on primary keys    (O(log n) lookup)
HeliosDB:    ART index on primary keys       (O(k) lookup, k = key length)

Both systems create indexes on the same columns. The index type reflects each system's default and strength.


Current Results Summary

35-Category Comparison (Embedded, Rust-Native)

| # | Category | HeliosDB | PostgreSQL | Ratio | Winner |
| --- | --- | --- | --- | --- | --- |
| 1 | Point lookup (PK) | ~39 us | ~270 us | 0.14 | HeliosDB (6.9x) |
| 2 | Full scan + filter | ~1.1 ms | ~1.5 ms | 0.73 | HeliosDB (1.4x) |
| 3 | Aggregation | ~1.6 ms | ~3.5 ms | 0.46 | HeliosDB (2.2x) |
| 4 | INNER JOIN | ~4.6 ms | ~11.0 ms | 0.42 | HeliosDB (2.4x) |
| 5 | LEFT JOIN | ~4.6 ms | ~11.5 ms | 0.40 | HeliosDB (2.5x) |
| 6 | Multi-table JOIN (4) | ~15 ms | ~33 ms | 0.45 | HeliosDB (2.2x) |
| 7 | EXISTS subquery | ~5.5 ms | ~18 ms | 0.31 | HeliosDB (3.3x) |
| 8 | CTE | ~6.8 ms | ~14 ms | 0.49 | HeliosDB (2.0x) |
| 9 | Window functions | ~1.2 ms | ~3.0 ms | 0.40 | HeliosDB (2.5x) |
| 10 | LIKE/BETWEEN/IN | ~1.9 ms | ~2.8 ms | 0.68 | HeliosDB (1.5x) |
| 11 | CASE expressions | ~1.5 ms | ~2.9 ms | 0.52 | HeliosDB (1.9x) |
| 12 | ORDER BY + LIMIT | ~2.2 ms | ~4.0 ms | 0.55 | HeliosDB (1.8x) |
| 13 | DISTINCT | ~1.5 ms | ~3.2 ms | 0.47 | HeliosDB (2.1x) |
| 14 | UNION | ~2.7 ms | ~5.5 ms | 0.49 | HeliosDB (2.0x) |
| 15 | REFRESH MV | ~289 ms | ~50 ms | 5.78 | PostgreSQL (5.8x) |

Full 35-category results are generated by the benchmark suite. The table above highlights representative categories.

Headline Numbers

| Metric | HeliosDB | PostgreSQL | Speedup |
| --- | --- | --- | --- |
| Point lookup (PK) | ~39 us | ~270 us | 6.9x faster |
| JOIN queries (2-table) | ~4.6 ms | ~11 ms | 2.4x faster |
| JOIN queries (4-table) | ~15 ms | ~33 ms | 2.2x faster |
| Aggregations | ~1.6 ms | ~3.5 ms | 2.2x faster |
| EXISTS (semi-join) | ~5.5 ms | ~18 ms | 3.3x faster |
| LIKE/BETWEEN/IN | ~1.9 ms | ~2.8 ms | 1.5x faster |
| Full table scan | ~1.1 ms | ~1.5 ms | 1.4x faster (I/O bound) |

Scalability Results (Wire Protocol, Docker)

| Clients | HeliosDB TPS | Scaling Factor | Efficiency |
| --- | --- | --- | --- |
| 10 | 242 | 1.0x | 100% |
| 50 | 1,209 | 5.0x | 100% |
| 100 | 2,419 | 10.0x | 100% |
| 200 | 4,825 | 19.9x | 100% |
| 500 | 11,775 | 48.7x | 97% |
| 1,000 | 20,491 | 84.7x | 85% |

HeliosDB demonstrates near-linear scaling up to 500 concurrent clients (97% efficiency) and maintains 85% efficiency at 1,000 clients.

Where PostgreSQL Wins

We report losses honestly:

| Category | Why PostgreSQL Wins |
| --- | --- |
| REFRESH Materialized View | PostgreSQL's bulk COPY path is highly optimized for batch writes. HeliosDB's MV refresh goes through the standard write path. This is a known gap we are actively improving. |

Reproducing Our Benchmarks

All benchmark code is open-source in the HeliosDB repository.

Rust-Native Comparison (Embedded, No Network)

Requires a local PostgreSQL 16 instance for the comparison side:

# Start PostgreSQL 16 for comparison
docker run -d --name pg_bench_16 \
  -e POSTGRES_USER=bench \
  -e POSTGRES_PASSWORD=benchpass \
  -e POSTGRES_DB=benchdb \
  -p 25432:5432 \
  postgres:16-alpine

# Run the 35-category comparison benchmark
cargo test --release --test pg_comparison_benchmark -- --nocapture --ignored

The benchmark will:

  1. Create identical schemas in both HeliosDB (embedded) and PostgreSQL
  2. Populate both with the same test data (200 customers, 50 products, 500 orders, 1000 order items, 20 categories)
  3. Run each of the 35 query categories with warmup + measurement
  4. Print a comparison table with ratios and winners

Docker Benchmarks (Wire Protocol via pgbench)

# Start the full benchmark environment
docker compose -f benchmarks/docker/docker-compose.benchmark.yml up -d

# Wait for all services to be healthy
docker compose -f benchmarks/docker/docker-compose.benchmark.yml ps

# Run benchmarks from the runner container
docker exec -it benchmark-runner bash /scripts/run-benchmarks.sh

Scalability Benchmarks

# Start the scalability test environment
docker compose -f benchmarks/docker/docker-compose.scalability.yml up -d

# Run scalability sweep (10, 50, 100, 200, 500, 1000 clients)
docker exec -it benchmark-runner bash /scripts/scalability-benchmark.sh

Interpreting Results

The benchmark suite outputs a markdown table. Key columns:

| Column | Meaning |
| --- | --- |
| HeliosDB | Average time per iteration for HeliosDB |
| PostgreSQL | Average time per iteration for PostgreSQL |
| Ratio | HeliosDB / PostgreSQL (< 1.0 = HeliosDB faster) |
| Winner | Which system won (with speedup factor) |
| Bottleneck | Internal phase that dominated HeliosDB execution time |

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| BENCH_WARMUP_SECS | 30 | Duration of warmup phase |
| BENCH_MEASURE_SECS | 30 | Duration of measurement phase |
| BENCH_ITERATIONS | 20 | Iterations per query category (embedded benchmarks) |
| PG_CONNSTR | host=localhost port=25432 user=bench password=benchpass dbname=benchdb | PostgreSQL connection string |
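
A harness might read these knobs as follows (illustrative sketch; the variable names and defaults are taken from the table above):

```python
import os

# Same knobs and defaults as documented above.
warmup_secs = int(os.environ.get("BENCH_WARMUP_SECS", "30"))
measure_secs = int(os.environ.get("BENCH_MEASURE_SECS", "30"))
iterations = int(os.environ.get("BENCH_ITERATIONS", "20"))
pg_connstr = os.environ.get(
    "PG_CONNSTR",
    "host=localhost port=25432 user=bench password=benchpass dbname=benchdb",
)
```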

Limitations and Caveats

We believe in full disclosure about what our benchmarks do and do not show:

  1. Embedded vs. client-server. The 35-category comparison runs HeliosDB in embedded mode (no network). PostgreSQL always has network overhead (localhost TCP). This favors HeliosDB for latency-sensitive micro-benchmarks. The Docker benchmarks level this by putting both systems behind wire protocols.
  2. Dataset size. The comparison benchmark uses a small dataset (~1,770 total rows across 5 tables). This fits entirely in memory for both systems. Large-dataset benchmarks (millions of rows, disk-spill scenarios) are planned but not yet published.
  3. Write-heavy workloads. Our scalability benchmarks primarily test read scaling. Write-heavy OLTP (TPC-B style) benchmarks show PostgreSQL's maturity advantage. We report these results in the Docker benchmark reports.
  4. Single-node only. All benchmarks are single-node. Distributed/sharded benchmarks are not yet available.
  5. Index type difference. HeliosDB uses ART indexes (O(k) lookup); PostgreSQL uses B-tree (O(log n)). This is each system's default and native strength, but it is not an apples-to-apples index comparison.

Last updated: February 2026. Benchmark code and results are available in the benchmarks/ and tests/ directories of the HeliosDB repository.

Run the benchmarks yourself

All benchmark code is open source. Verify our claims on your own hardware.
