Skip to content

HeliosDB-Full Performance Evolution vs PostgreSQL

HeliosDB-Full Performance Evolution vs PostgreSQL

Benchmark: 1000 rows, 5 columns (INT, TEXT, TEXT, INT, TEXT), PRIMARY KEY on id PostgreSQL: 16.11 (Docker), HeliosDB: v7.1.0 (compiled binary) Date: 2026-02-14

Latest Results — All Clients (same session, same system load)

TestPostgreSQLPython→HeliosRust→HeliosDirect API
Point Lookups132us75us90us1us
Full Scan859us323us751us628us
Aggregates148us55us83us224us
INSERT481us75us93us2us
UPDATE466us83us87us3us
DELETE450us70us71us2us
Mixed OLTP271us82us73us1us

Win/Loss vs PostgreSQL (latest run)

TestPython→HeliosRust→HeliosDirect API
Point Lookups1.8x W1.5x W132x W
Full Scan2.7x W1.1x W1.4x W
Aggregates2.7x W1.8x W1.5x L
INSERT6.4x W5.2x W241x W
UPDATE5.6x W5.4x W155x W
DELETE6.5x W6.3x W225x W
Mixed OLTP3.3x W3.7x W271x W
Score7W / 0L7W / 0L6W / 1L

Historical Optimization Rounds (avg latency, microseconds)

TestPostgreSQLBaselineRound 1Round 2Round 3Round 4Round 5Round 6
Point Lookups110us72us57us65us57us155us182us75us
Full Scan723us7000us1200us1100us1100us1400us1300us323us
Aggregates136us871us692us381us279us420us90us55us
INSERT450us860us733us70us66us212us233us75us
UPDATE515us1600us560us78us80us183us275us83us
DELETE571us1600us551us62us92us184us215us70us
Mixed OLTP285us921us541us215us67us215us275us82us
Score1W/6L1W/6L5W/2L5W/2L6W/1L7W/0L7W/0L

PostgreSQL column uses Baseline-run PG values. Rounds ran on different system loads; relative improvements are the key metric.

Client Comparison

ClientProtocolOverheadBest For
Python (psycopg2)PG wire (TCP)Python runtime + C extensionStandard benchmarking
Rust (tokio-postgres)PG wire (TCP, simple query)Minimal (async Rust)Low-overhead wire protocol
Direct APINone (in-process)Zero network/protocolMaximum throughput

Optimization Rounds

Baseline (commit 0bf96ff4)

  • Point lookup fast path (pk_index)
  • TCP_NODELAY
  • Equal PK conditions on both databases

Round 1 (commit d47f8c64) — 6 Optimizations

  • PK index fast path for UPDATE/DELETE (O(1) instead of table scan)
  • Batched DataRow wire writes (single write_all())
  • Streaming aggregates (single-pass, no materialization)
  • Stack-allocated integer formatting (itoa)
  • Skip autocommit INSERT overhead
  • Direct StoredRow-to-wire encoding

Round 2 (commit d90bc32e) — Write Path Revolution

  • DirectIoWal group commit (batch 256 entries / 5ms, eliminating per-write fsync)
  • COUNT(*) ultra-fast path (zero decode)
  • decode_values_only with raw MessagePack parser
  • Single-column numeric extraction for SUM/AVG

Round 3 (commit cf5f8649) — COUNT(*) via pk_index

  • Pre-scan COUNT(*) uses pk_index.len() — O(1), no LSM scan
  • Raw MessagePack column extractor for partial decode

Round 4 (commit 970d8d0d) — Zero-Allocation Full Scan

  • Raw MessagePack to PostgreSQL wire format directly (bypasses serde + StoredValue + ParameterValue)
  • pending_wire_buf field for pre-built DataRow buffer
  • Full Scan flipped from PostgreSQL winning to HeliosDB winning

Round 5 (commit 702f6b12) — Incremental Aggregate Accumulators

  • TableAggregates struct: per-column running SUM/COUNT maintained on every INSERT/UPDATE/DELETE
  • O(1) aggregate lookups for COUNT/SUM/AVG without WHERE clause
  • Scan cache with write-invalidation (500ms TTL)
  • Aggregates flipped from PostgreSQL 1.8x faster to HeliosDB 3.2x FASTER

Round 6 — Wire-Format Cache

  • Cache pre-built PostgreSQL DataRow buffer (wire_cache) instead of raw scan data
  • On cache hit: clone ~60KB wire bytes directly → zero scan, zero decode, zero encode
  • Full Scan flipped from 2.3x LOSS to 2.7x WIN (1.7ms → 323us)
  • HeliosDB wins all 7/7 categories across all client types

Rust Benchmark Binary (commit 6b66f1f7)

  • heliosdb-bench --mode rust-client: tokio-postgres simple query protocol
  • heliosdb-bench --mode direct: Zero-network LsmStorageEngine API
  • heliosdb-bench --mode both: Run both modes sequentially

Key Techniques Summary

OptimizationImpactCategory
Direct API (no network)132-271x point opsAll categories
DirectIoWal group commit6-13x write speedupINSERT/UPDATE/DELETE
pk_index O(1) lookup3-10x point opsPoint/UPDATE/DELETE
Wire-format cache5.3x scan speedupFull Scan
Incremental aggregates2-7x aggregate speedupAggregates
Zero-alloc MessagePack→wire1.5x scan speedupFull Scan
Batched wire writes1.3x output speedupFull Scan/Aggregates
Skip autocommit overhead1.2x write speedupINSERT
Scan cache (500ms TTL)Avoids repeated LSM scansMixed OLTP