Transparent, Reproducible, Fair

Every performance claim we make is backed by open-source benchmark code that anyone can run. We believe that database benchmarks should be transparent in methodology, reproducible in any environment, and fair to all systems under comparison. This document describes exactly how we conduct our benchmarks -- from hardware configuration to statistical analysis -- so you can evaluate our results with full confidence and reproduce them yourself.

We follow three principles:

  1. No cherry-picking. We report all 35 query categories, including the ones where PostgreSQL wins.
  2. Equal resources. Both systems get the same CPU, memory, and storage allocation.
  3. Default configurations. HeliosDB uses out-of-the-box defaults. PostgreSQL gets recommended production tuning (which favors PostgreSQL).

Benchmark Categories

We run four distinct types of benchmarks, each designed to measure a different dimension of database performance.

Micro-benchmarks (Single-Operation Latency)

Measure the raw cost of individual operations in an embedded deployment, with no network overhead. These isolate the storage engine and query executor.

| Operation | What It Measures |
| --- | --- |
| Point lookup (PK) | ART index traversal + MVCC read |
| Single-row INSERT | Write path + WAL + commit |
| Range scan (100 rows) | Sequential read throughput |
| Aggregate (COUNT/SUM) | Full-table scan + computation |
| Index creation | Bulk ART construction |

Scalability Benchmarks (Throughput vs. Concurrency)

Measure how throughput scales as concurrent clients increase, using pgbench over the PostgreSQL wire protocol.

| Concurrency Level | Purpose |
| --- | --- |
| 10 clients | Baseline contention-free performance |
| 50 clients | Moderate concurrency |
| 100 clients | Production-typical load |
| 200 clients | High-concurrency stress |
| 500 clients | Connection pool saturation |
| 1000 clients | Extreme concurrency |

Comparison Benchmarks (HeliosDB vs. PostgreSQL 16)

Head-to-head comparison across 35 SQL query categories. Both engines receive identical schemas, identical data, and identical queries. The embedded Rust-native comparison eliminates network variables.

35 Query Categories

| # | Category | # | Category |
| --- | --- | --- | --- |
| 1 | CREATE TABLE | 19 | LEFT JOIN |
| 2 | CREATE INDEX | 20 | Multi-table JOIN (4 tables) |
| 3 | ALTER TABLE | 21 | Scalar subquery |
| 4 | DROP TABLE | 22 | EXISTS subquery |
| 5 | CREATE/DROP VIEW | 23 | IN subquery |
| 6 | REFRESH Materialized View | 24 | Common Table Expression (CTE) |
| 7 | TRUNCATE | 25 | Recursive CTE |
| 8 | INSERT (single-row) | 26 | Window functions |
| 9 | INSERT (multi-row) | 27 | UNION |
| 10 | INSERT...SELECT | 28 | DISTINCT |
| 11 | UPDATE (point) | 29 | ORDER BY + LIMIT |
| 12 | DELETE (point) | 30 | CASE expressions |
| 13 | UPSERT (ON CONFLICT) | 31 | JSON filter |
| 14 | UPDATE with subquery | 32 | LIKE / BETWEEN / IN |
| 15 | Point lookup (PK) | 33 | Transaction control |
| 16 | Full scan + filter | 34 | Prepared statements |
| 17 | Aggregation | 35 | SET / SHOW / RESET |
| 18 | INNER JOIN | | |

Docker Benchmarks (Wire Protocol via pgbench)

Compare HeliosDB + HeliosProxy against PostgreSQL + PgBouncer in Docker containers, using pgbench as the standard load generator. This tests real-world deployment conditions: network overhead, connection pooling, and container resource constraints.


Hardware and Environment

Standard Test Environment

All published benchmarks are run on the following reference hardware unless otherwise noted:

| Component | Specification |
| --- | --- |
| CPU | 8-core (x86_64) |
| Memory | 32 GB DDR4 |
| Storage | NVMe SSD (sequential read >3 GB/s) |
| OS | Linux (kernel 5.14+) |
| Rust | stable (latest) |
| Docker | 24.x with Compose v2 |

Docker Benchmark Resource Allocation

Both database systems receive equal container resources to ensure a fair comparison:

# PostgreSQL container
deploy:
  resources:
    limits:
      cpus: '2'
      memory: 2G

# HeliosDB container
deploy:
  resources:
    limits:
      cpus: '2'
      memory: 2G

PostgreSQL Tuning

PostgreSQL receives production-recommended tuning. This favors PostgreSQL, since HeliosDB uses default configuration.

shared_buffers = 512MB
effective_cache_size = 2GB
work_mem = 64MB
max_connections = 500

HeliosDB Configuration

HeliosDB runs with default out-of-the-box settings. No special tuning is applied. This demonstrates real-world performance for users who deploy without manual optimization.


Methodology

Test Execution Protocol

Each benchmark follows the same execution protocol:

1. Schema creation (identical DDL for both systems)
2. Data population (identical rows, same cardinality)
3. Warmup phase: 30 seconds (queries run but not measured)
4. Measurement phase: 30 seconds (all metrics collected)
5. Cooldown and results aggregation

Both the warmup and measurement durations are configurable. Published results use 30-second windows unless stated otherwise.
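
The protocol's timing core can be sketched as a simple loop. This is an illustrative sketch, not the actual harness; `query_fn` is a stand-in for whatever query the category under test executes:

```python
import time

def run_benchmark(query_fn, warmup_secs=30.0, measure_secs=30.0):
    """Warmup then measure, mirroring steps 3-4 of the protocol."""
    # Warmup phase: execute queries but discard all timings.
    deadline = time.monotonic() + warmup_secs
    while time.monotonic() < deadline:
        query_fn()

    # Measurement phase: record per-query latency in microseconds.
    latencies_us = []
    deadline = time.monotonic() + measure_secs
    while time.monotonic() < deadline:
        start = time.monotonic()
        query_fn()
        latencies_us.append((time.monotonic() - start) * 1e6)
    return latencies_us
```

The collected latencies feed the percentile metrics described under "Metrics Collected".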

Repetition and Reporting

  • Each test configuration runs 3 times
  • The median of the 3 runs is reported (not the best, not the average)
  • This eliminates single-run anomalies while avoiding outlier influence
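
For example, the median-of-three rule applied to hypothetical TPS figures:

```python
from statistics import median

# Three hypothetical TPS results for one test configuration.
runs_tps = [2380.0, 2512.0, 2419.0]

# Report the median run: not the best run, not the mean.
reported = median(runs_tps)
```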

Benchmark Tools

| Benchmark Type | Tool | Rationale |
| --- | --- | --- |
| Wire protocol (Docker) | pgbench (PostgreSQL 16) | Industry-standard OLTP benchmark tool |
| Embedded comparison | Rust-native harness | Eliminates network overhead; measures engine directly |
| Scalability | pgbench with varying -c flag | Standard concurrency scaling methodology |

Query Design

Queries are designed to isolate specific operations:

-- Point lookup (PK index scan)
SELECT * FROM customers WHERE id = 42;

-- Range scan with filter
SELECT * FROM customers WHERE age BETWEEN 25 AND 35;

-- Aggregation
SELECT region, COUNT(*), AVG(age) FROM customers GROUP BY region;

-- INNER JOIN (2 tables)
SELECT c.name, o.order_id, o.total
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id
WHERE c.region = 'East';

-- Multi-table JOIN (4 tables)
SELECT c.name, o.order_id, p.name, oi.quantity
FROM customers c
JOIN orders o ON o.customer_id = c.id
JOIN order_items oi ON oi.order_id = o.order_id
JOIN products p ON p.product_id = oi.product_id
WHERE o.status = 'shipped';

-- Window function
SELECT name, age, region,
       ROW_NUMBER() OVER (PARTITION BY region ORDER BY age DESC)
FROM customers;

-- Recursive CTE
WITH RECURSIVE tree AS (
    SELECT cat_id, name, parent_id, 0 AS depth
    FROM categories WHERE parent_id IS NULL
    UNION ALL
    SELECT c.cat_id, c.name, c.parent_id, t.depth + 1
    FROM categories c JOIN tree t ON c.parent_id = t.cat_id
)
SELECT * FROM tree ORDER BY depth, cat_id;

-- EXISTS subquery
SELECT c.name FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.customer_id = c.id AND o.status = 'shipped'
);

-- LIKE / BETWEEN / IN
SELECT * FROM customers
WHERE name LIKE 'Customer_1%'
  AND age BETWEEN 20 AND 50
  AND region IN ('East', 'West');

Test Data

Both systems are populated with identical data:

| Table | Rows | Description |
| --- | --- | --- |
| customers | 200 | 7 columns including name, email, age, region, metadata |
| products | 50 | 5 columns including category, price, description |
| orders | 500 | 5 columns with FK to customers |
| order_items | 1,000 | 5 columns with FKs to orders and products |
| categories | 20 | Hierarchical (self-referencing parent_id) |
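
The data shape can be sketched as follows. This is an illustrative generator, not the actual population code: column sets are abbreviated, and a fixed seed stands in for however the harness guarantees identical rows in both systems:

```python
import random

def make_test_data(seed=42):
    # Fixed seed: both systems must be populated with identical rows.
    rng = random.Random(seed)
    regions = ["East", "West", "North", "South"]
    customers = [
        {"id": i, "name": f"Customer_{i}", "age": rng.randint(18, 80),
         "region": rng.choice(regions)}
        for i in range(1, 201)
    ]
    products = [
        {"product_id": i, "price": round(rng.uniform(1.0, 500.0), 2)}
        for i in range(1, 51)
    ]
    orders = [
        {"order_id": i, "customer_id": rng.randint(1, 200),
         "status": rng.choice(["shipped", "pending"])}
        for i in range(1, 501)
    ]
    order_items = [
        {"order_id": rng.randint(1, 500), "product_id": rng.randint(1, 50),
         "quantity": rng.randint(1, 10)}
        for _ in range(1000)
    ]
    # 20 hierarchical categories: a few roots, the rest reference an earlier row.
    categories = [
        {"cat_id": i, "parent_id": None if i <= 4 else rng.randint(1, i - 1)}
        for i in range(1, 21)
    ]
    return customers, products, orders, order_items, categories
```

Note the `Customer_<n>` naming, which is what the `LIKE 'Customer_1%'` query above selects against.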

Metrics Collected

Primary Metrics

| Metric | Unit | Description |
| --- | --- | --- |
| TPS | transactions/sec | Completed transactions per second during measurement window |
| P50 latency | microseconds | Median latency (50th percentile) |
| P95 latency | microseconds | 95th percentile latency |
| P99 latency | microseconds | 99th percentile latency |

Derived Metrics

| Metric | Formula | Description |
| --- | --- | --- |
| Scaling factor | TPS(N threads) / TPS(1 thread) | How well throughput scales with concurrency |
| Scaling efficiency | Scaling factor / N | Percentage of ideal linear scaling |
| Comparison ratio | HeliosDB time / PostgreSQL time | Values < 1.0 mean HeliosDB is faster |
| Speedup | 1 / comparison ratio | e.g., ratio 0.14 ≈ 6.9x speedup |
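
These formulas are simple enough to state directly. A minimal sketch, where `n` is the concurrency multiple relative to the baseline run (the published scalability table baselines at 10 clients):

```python
def scaling_factor(tps_n, tps_baseline):
    """TPS at n-way concurrency relative to the baseline run."""
    return tps_n / tps_baseline

def scaling_efficiency(tps_n, tps_baseline, n):
    """Fraction of ideal linear scaling; n is the concurrency multiple."""
    return scaling_factor(tps_n, tps_baseline) / n

def comparison_ratio(helios_time, pg_time):
    """< 1.0 means HeliosDB is faster."""
    return helios_time / pg_time

def speedup(ratio):
    """Reciprocal of the comparison ratio."""
    return 1.0 / ratio
```

With the point-lookup figures (~39 us vs. ~270 us) this gives a ratio of ~0.14 and a ~6.9x speedup; with the 500-client scalability row (11,775 TPS against the 242 TPS baseline, n = 50) it gives ~97% efficiency.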

Per-Phase Tracing (HeliosDB Only)

HeliosDB benchmarks also collect internal per-phase timing via the built-in tracing system:

| Phase | What It Measures |
| --- | --- |
| Parse | SQL text to AST (sqlparser-rs) |
| Plan | AST to logical plan |
| Optimize | Rule-based and cost-based optimization passes |
| Execute | Plan execution against storage engine |

Enable tracing with:

SET helios.trace_queries = on;
-- Run queries...
SHOW helios.trace_report;

Statistical Rigor

Outlier Handling

  • The first and last 5% of samples in each measurement window are discarded
  • This eliminates JIT warmup artifacts and OS scheduling noise at the tail
  • Remaining samples are used for all percentile calculations
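
The trimming policy can be sketched as follows (an illustrative implementation; the percentile definition here is one of several common ones and may differ from the harness):

```python
def trimmed_percentiles(samples):
    """Drop the first and last 5% of samples in arrival order, then
    compute P50/P95/P99 over what remains."""
    n = len(samples)
    cut = n // 20  # 5% of the measurement window at each end
    kept = samples[cut:n - cut] if cut else list(samples)
    ranked = sorted(kept)

    def pct(p):
        # Simple index-based percentile; real harnesses may interpolate.
        return ranked[round(p / 100 * (len(ranked) - 1))]

    return {"p50": pct(50), "p95": pct(95), "p99": pct(99)}
```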

Confidence Intervals

  • Comparison benchmarks report 95% confidence intervals for the ratio
  • A result is only reported as a "win" if the confidence interval does not cross 1.0
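
One standard way to obtain such an interval is a percentile bootstrap over per-iteration timings. The sketch below is an assumed construction for illustration, not necessarily the harness's actual method:

```python
import random

def bootstrap_ratio_ci(helios_times, pg_times, reps=2000, seed=0):
    """Percentile-bootstrap 95% confidence interval for the mean-time ratio."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        # Resample each side with replacement and recompute the ratio of means.
        h = [rng.choice(helios_times) for _ in helios_times]
        p = [rng.choice(pg_times) for _ in pg_times]
        ratios.append((sum(h) / len(h)) / (sum(p) / len(p)))
    ratios.sort()
    return ratios[int(0.025 * reps)], ratios[int(0.975 * reps)]

def is_win(ci_low, ci_high):
    # A "win" requires the whole interval on one side of 1.0.
    return ci_high < 1.0 or ci_low > 1.0
```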

"Win" Definition

A system "wins" a category only if it is more than 5% faster than the other system. Results within 5% are reported as comparable (within noise margin).

| Ratio Range | Verdict |
| --- | --- |
| < 0.95 | HeliosDB wins |
| 0.95 -- 1.05 | Comparable (within noise) |
| > 1.05 | PostgreSQL wins |
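
The verdict rule as a function (boundary handling at exactly 0.95 and 1.05 is our reading of the table):

```python
def verdict(ratio):
    """Map a comparison ratio to a verdict, with a 5% noise margin."""
    if ratio < 0.95:
        return "HeliosDB wins"
    if ratio <= 1.05:
        return "Comparable (within noise)"
    return "PostgreSQL wins"
```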

Reporting Honesty

  • We report all 35 categories, including losses
  • Tables include the raw ratio so readers can judge for themselves
  • Where PostgreSQL wins, we explain why (and whether we plan to address it)

Comparison Methodology (vs. PostgreSQL)

Ground Rules

| Rule | Details |
| --- | --- |
| Same queries | Identical SQL text (adjusted only for syntax differences where necessary) |
| Same data | Identical row counts, identical values, same cardinality |
| Same hardware | Both run on the same machine (embedded) or same Docker host (wire protocol) |
| Warm caches | Both systems execute warmup queries before measurement begins |
| Index fairness | PostgreSQL uses B-tree indexes; HeliosDB uses ART indexes (each system's default) |
| Configuration | PostgreSQL gets production tuning (shared_buffers=512MB); HeliosDB uses defaults |

How Ratios Are Computed

ratio = HeliosDB average time / PostgreSQL average time

| Ratio | Meaning |
| --- | --- |
| 0.14 | HeliosDB is 6.9x faster |
| 0.40 | HeliosDB is 2.5x faster |
| 1.00 | Identical performance |
| 1.20 | PostgreSQL is 1.2x faster |

Schema and Index Comparison

PostgreSQL:  B-tree index on primary keys    (O(log n) lookup)
HeliosDB:    ART index on primary keys       (O(k) lookup, k = key length)

Both systems create indexes on the same columns. The index type reflects each system's default and strength.


Current Results Summary

35-Category Comparison (Embedded, Rust-Native)

| # | Category | HeliosDB | PostgreSQL | Ratio | Winner |
| --- | --- | --- | --- | --- | --- |
| 1 | Point lookup (PK) | ~39 us | ~270 us | 0.14 | HeliosDB (6.9x) |
| 2 | Full scan + filter | ~1.1 ms | ~1.5 ms | 0.73 | HeliosDB (1.4x) |
| 3 | Aggregation | ~1.6 ms | ~3.5 ms | 0.46 | HeliosDB (2.2x) |
| 4 | INNER JOIN | ~4.6 ms | ~11.0 ms | 0.42 | HeliosDB (2.4x) |
| 5 | LEFT JOIN | ~4.6 ms | ~11.5 ms | 0.40 | HeliosDB (2.5x) |
| 6 | Multi-table JOIN (4) | ~15 ms | ~33 ms | 0.45 | HeliosDB (2.2x) |
| 7 | EXISTS subquery | ~5.5 ms | ~18 ms | 0.31 | HeliosDB (3.3x) |
| 8 | CTE | ~6.8 ms | ~14 ms | 0.49 | HeliosDB (2.0x) |
| 9 | Window functions | ~1.2 ms | ~3.0 ms | 0.40 | HeliosDB (2.5x) |
| 10 | LIKE/BETWEEN/IN | ~1.9 ms | ~2.8 ms | 0.68 | HeliosDB (1.5x) |
| 11 | CASE expressions | ~1.5 ms | ~2.9 ms | 0.52 | HeliosDB (1.9x) |
| 12 | ORDER BY + LIMIT | ~2.2 ms | ~4.0 ms | 0.55 | HeliosDB (1.8x) |
| 13 | DISTINCT | ~1.5 ms | ~3.2 ms | 0.47 | HeliosDB (2.1x) |
| 14 | UNION | ~2.7 ms | ~5.5 ms | 0.49 | HeliosDB (2.0x) |
| 15 | REFRESH MV | ~289 ms | ~50 ms | 5.78 | PostgreSQL (5.8x) |

Full 35-category results are generated by the benchmark suite. The table above highlights representative categories.

Headline Numbers

| Metric | HeliosDB | PostgreSQL | Speedup |
| --- | --- | --- | --- |
| Point lookup (PK) | ~39 us | ~270 us | 6.9x faster |
| JOIN queries (2-table) | ~4.6 ms | ~11 ms | 2.4x faster |
| JOIN queries (4-table) | ~15 ms | ~33 ms | 2.2x faster |
| Aggregations | ~1.6 ms | ~3.5 ms | 2.2x faster |
| EXISTS (semi-join) | ~5.5 ms | ~18 ms | 3.3x faster |
| LIKE/BETWEEN/IN | ~1.9 ms | ~2.8 ms | 1.5x faster |
| Full table scan | ~1.1 ms | ~1.5 ms | 1.4x faster (I/O bound) |

Scalability Results (Wire Protocol, Docker)

| Clients | HeliosDB TPS | Scaling Factor | Efficiency |
| --- | --- | --- | --- |
| 10 | 242 | 1.0x | 100% |
| 50 | 1,209 | 5.0x | 100% |
| 100 | 2,419 | 10.0x | 100% |
| 200 | 4,825 | 19.9x | 100% |
| 500 | 11,775 | 48.7x | 97% |
| 1,000 | 20,491 | 84.7x | 85% |

HeliosDB demonstrates near-linear scaling up to 500 concurrent clients (97% efficiency) and maintains 85% efficiency at 1,000 clients.

Where PostgreSQL Wins

We report losses honestly:

| Category | Why PostgreSQL Wins |
| --- | --- |
| REFRESH Materialized View | PostgreSQL's bulk COPY path is highly optimized for batch writes. HeliosDB's MV refresh goes through the standard write path. This is a known gap we are actively improving. |

Reproducing Our Benchmarks

All benchmark code is open-source in the HeliosDB repository.

Rust-Native Comparison (Embedded, No Network)

Requires a local PostgreSQL 16 instance for the comparison side:

# Start PostgreSQL 16 for comparison
docker run -d --name pg_bench_16 \
  -e POSTGRES_USER=bench \
  -e POSTGRES_PASSWORD=benchpass \
  -e POSTGRES_DB=benchdb \
  -p 25432:5432 \
  postgres:16-alpine

# Run the 35-category comparison benchmark
cargo test --release --test pg_comparison_benchmark -- --nocapture --ignored

The benchmark will:

  1. Create identical schemas in both HeliosDB (embedded) and PostgreSQL
  2. Populate both with the same test data (200 customers, 50 products, 500 orders, 1000 order items, 20 categories)
  3. Run each of the 35 query categories with warmup + measurement
  4. Print a comparison table with ratios and winners

Docker Benchmarks (Wire Protocol via pgbench)

# Start the full benchmark environment
docker compose -f benchmarks/docker/docker-compose.benchmark.yml up -d

# Wait for all services to be healthy
docker compose -f benchmarks/docker/docker-compose.benchmark.yml ps

# Run benchmarks from the runner container
docker exec -it benchmark-runner bash /scripts/run-benchmarks.sh

Scalability Benchmarks

# Start the scalability test environment
docker compose -f benchmarks/docker/docker-compose.scalability.yml up -d

# Run scalability sweep (10, 50, 100, 200, 500, 1000 clients)
docker exec -it benchmark-runner bash /scripts/scalability-benchmark.sh

Interpreting Results

The benchmark suite outputs a markdown table. Key columns:

| Column | Meaning |
| --- | --- |
| HeliosDB | Average time per iteration for HeliosDB |
| PostgreSQL | Average time per iteration for PostgreSQL |
| Ratio | HeliosDB / PostgreSQL (< 1.0 = HeliosDB faster) |
| Winner | Which system won (with speedup factor) |
| Bottleneck | Internal phase that dominated HeliosDB execution time |

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| BENCH_WARMUP_SECS | 30 | Duration of warmup phase |
| BENCH_MEASURE_SECS | 30 | Duration of measurement phase |
| BENCH_ITERATIONS | 20 | Iterations per query category (embedded benchmarks) |
| PG_CONNSTR | host=localhost port=25432 user=bench password=benchpass dbname=benchdb | PostgreSQL connection string |
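
A harness might read these knobs as follows (illustrative sketch; the variable names and defaults are taken from the table above):

```python
import os

# Same knobs and defaults as documented above.
warmup_secs = int(os.environ.get("BENCH_WARMUP_SECS", "30"))
measure_secs = int(os.environ.get("BENCH_MEASURE_SECS", "30"))
iterations = int(os.environ.get("BENCH_ITERATIONS", "20"))
pg_connstr = os.environ.get(
    "PG_CONNSTR",
    "host=localhost port=25432 user=bench password=benchpass dbname=benchdb",
)
```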

Limitations and Caveats

We believe in full disclosure about what our benchmarks do and do not show:

  1. Embedded vs. client-server. The 35-category comparison runs HeliosDB in embedded mode (no network). PostgreSQL always has network overhead (localhost TCP). This favors HeliosDB for latency-sensitive micro-benchmarks. The Docker benchmarks level this by putting both systems behind wire protocols.
  2. Dataset size. The comparison benchmark uses a small dataset (~1,770 total rows across 5 tables). This fits entirely in memory for both systems. Large-dataset benchmarks (millions of rows, disk-spill scenarios) are planned but not yet published.
  3. Write-heavy workloads. Our scalability benchmarks primarily test read scaling. Write-heavy OLTP (TPC-B style) benchmarks show PostgreSQL's maturity advantage. We report these results in the Docker benchmark reports.
  4. Single-node only. All benchmarks are single-node. Distributed/sharded benchmarks are not yet available.
  5. Index type difference. HeliosDB uses ART indexes (O(k) lookup); PostgreSQL uses B-tree (O(log n)). This is each system's default and native strength, but it is not an apples-to-apples index comparison.

Last updated: February 2026. Benchmark code and results are available in the benchmarks/ and tests/ directories of the HeliosDB repository.

Run the benchmarks yourself

All benchmark code is open source. Verify our claims on your own hardware.
