HTAP & Columnar Storage
HTAP & Columnar Storage
HeliosDB-Lite supports Hybrid Transactional/Analytical Processing (HTAP) with per-column storage modes. Columnar storage groups values from the same column together in batches, improving compression and analytical query performance.
Prerequisites
- HeliosDB-Lite v3.5 or later
- Access to the SQL shell (REPL, PostgreSQL wire protocol, or REST API)
Step 1 — Create a Table with Columnar Columns
Mark individual columns as STORAGE COLUMNAR when the column will be used
primarily for analytical reads (aggregations, scans, range filters).
CREATE TABLE metrics ( id INT PRIMARY KEY, sensor_id INT, timestamp INT8 STORAGE COLUMNAR, value FLOAT8 STORAGE COLUMNAR, label TEXT);The id, sensor_id, and label columns use the default row-oriented storage.
The timestamp and value columns are stored in columnar batches of 1024
values each, optimized for sequential reads.
Step 2 — Insert Data
Insert data exactly as you would with a row table. HeliosDB handles the columnar layout transparently.
INSERT INTO metrics VALUES (1, 10, 1711900000, 23.5, 'temperature');INSERT INTO metrics VALUES (2, 10, 1711900060, 23.7, 'temperature');INSERT INTO metrics VALUES (3, 20, 1711900000, 55.0, 'humidity');INSERT INTO metrics VALUES (4, 20, 1711900060, 54.8, 'humidity');Step 3 — Run an Analytical Query
Columnar columns shine during aggregations that touch many rows but few columns.
SELECT sensor_id, AVG(value) AS avg_value, COUNT(*) AS readingsFROM metricsGROUP BY sensor_id;Expected output:
sensor_id | avg_value | readings-----------+-----------+--------- 10 | 23.60 | 2 20 | 54.90 | 2Step 4 — Inspect the Query Plan
Use EXPLAIN to confirm HeliosDB uses a Columnar Scan for columns stored in
columnar mode.
EXPLAIN SELECT AVG(value) FROM metrics;Expected output (excerpt):
Aggregate -> Columnar Scan on metrics (columns: value)The plan shows Columnar Scan instead of a full row scan, meaning only the
value column data is read from storage.
Step 5 — Workload Classification
EXPLAIN output includes workload classification, scoring the query as OLTP or OLAP. Analytical queries score higher on the OLAP axis.
EXPLAIN SELECT sensor_id, AVG(value) FROM metrics GROUP BY sensor_id;Classification section in output:
Workload Classification: Type : OLAP OLTP Score : 0.20 OLAP Score : 0.80Point lookups and small updates will classify as OLTP:
EXPLAIN SELECT * FROM metrics WHERE id = 1;Workload Classification: Type : OLTP OLTP Score : 0.90 OLAP Score : 0.10Step 6 — Alter an Existing Column to Columnar
You can change the storage mode of an existing column without rewriting the table.
ALTER TABLE metrics ALTER COLUMN label SET STORAGE COLUMNAR;After this DDL, new writes to label will use columnar batches. Existing rows
are migrated lazily on the next compaction or vacuum.
Step 7 — Compression
HeliosDB uses Zstd compression by default for all stored data. You can
configure compression at the storage level in heliosdb.toml:
[storage]compression = "Zstd" # Options: Zstd, Lz4, NoneColumnar data compresses especially well because similar values (same type, same domain) are grouped together, giving Zstd and LZ4 better compression ratios than row-oriented layouts.
Tips and Troubleshooting
-
When to use columnar: Columns that are frequently aggregated, filtered in range scans, or rarely updated individually. Metrics, logs, and time-series data are ideal candidates.
-
When NOT to use columnar: Columns that are updated in-place frequently or accessed in single-row lookups. Keep those in default row storage.
-
Mixed workloads: HeliosDB’s HTAP design lets you mix columnar and row columns in the same table. OLTP-style point lookups use row-stored columns while OLAP-style aggregations benefit from columnar columns.
-
Batch size: Columnar batches hold 1024 values. Tables with fewer than 1024 rows in a column will still use columnar format but with partially filled batches.
-
EXPLAIN ANALYZE: Use
EXPLAIN ANALYZEto see per-operator timing and row counts, confirming that columnar scans are reading only the columns needed.
EXPLAIN ANALYZE SELECT AVG(value) FROM metrics;- Resource management: The HTAP resource manager assigns separate quotas for OLTP and OLAP workloads, preventing analytical queries from starving transactional ones.