Skip to content

Per-Column Storage Modes Guide

Per-Column Storage Modes Guide

HeliosDB Nano v3.6+

This guide explains how to use per-column storage modes to optimize storage and query performance for different data characteristics.


Overview

HeliosDB Nano provides four storage modes that can be applied on a per-column basis:

ModeSyntaxBest ForStorage Savings
Default(none)General OLTP workloadsBaseline
DictionarySTORAGE DICTIONARYLow-cardinality strings50-95%
Content-AddressedSTORAGE CONTENT_ADDRESSEDLarge values with duplicates~100% dedup
ColumnarSTORAGE COLUMNARAnalytics, aggregations20-50% better

All modes benefit from RocksDB’s built-in LZ4 block compression (3.7-7x automatic compression).


Quick Start

CREATE TABLE with Storage Modes

CREATE TABLE events (
id INT PRIMARY KEY,
event_type TEXT STORAGE DICTIONARY, -- ~50 unique event types
user_country TEXT STORAGE DICTIONARY, -- ~250 country codes
payload TEXT STORAGE CONTENT_ADDRESSED, -- Large JSON payloads
metrics FLOAT8[] STORAGE COLUMNAR, -- Analytics aggregations
timestamp INT8 STORAGE COLUMNAR, -- Time-series data
description TEXT -- Default storage
);

ALTER TABLE for Existing Data

-- Migrate existing column to dictionary encoding
ALTER TABLE events ALTER COLUMN event_type SET STORAGE DICTIONARY;
-- Migrate to content-addressed storage
ALTER TABLE events ALTER COLUMN payload SET STORAGE CONTENT_ADDRESSED;
-- Migrate to columnar storage
ALTER TABLE events ALTER COLUMN metrics SET STORAGE COLUMNAR;
-- Revert to default
ALTER TABLE events ALTER COLUMN description SET STORAGE DEFAULT;

Important: ALTER TABLE ... SET STORAGE performs an online migration that restructures all existing data without downtime.


Storage Mode Details

1. Default Storage

Standard row-oriented bincode serialization stored directly in RocksDB.

Characteristics:

  • Values stored inline in row tuples
  • Best for OLTP point queries
  • Optimal for high-cardinality data
  • No additional overhead

When to use:

  • Primary keys and foreign keys
  • High-cardinality strings (UUIDs, unique identifiers)
  • Small numeric values
  • Data queried by primary key
CREATE TABLE users (
id INT PRIMARY KEY, -- Default: primary key lookups
email TEXT, -- Default: unique per user
created_at TIMESTAMP -- Default: small fixed-size
);

2. Dictionary Encoding (STORAGE DICTIONARY)

Maps string values to compact integer IDs using a per-column dictionary.

Characteristics:

  • Value → ID mapping stored separately
  • Each unique value stored once in dictionary
  • Row tuples contain only 4-byte IDs
  • Best for <64K unique values

Storage format:

dict:{table}:{column} → bincode HashMap<String, u32>
Row tuple: Value::DictRef { dict_id: u32 }

When to use:

  • Status codes (pending, approved, rejected)
  • Country/region codes (250 unique values)
  • Category names (product categories, tags)
  • Enum-like string columns

Storage savings:

CardinalityOriginalDictionarySavings
6 values10 bytes avg4 bytes60%
100 values15 bytes avg4 bytes73%
1000 values20 bytes avg4 bytes80%

Example:

CREATE TABLE orders (
id INT PRIMARY KEY,
status TEXT STORAGE DICTIONARY, -- 6 values: pending, processing, shipped, delivered, cancelled, returned
payment_method TEXT STORAGE DICTIONARY, -- 5 values: card, paypal, bank, crypto, cash
country TEXT STORAGE DICTIONARY -- ~200 country codes
);
-- 10M orders with repetitive status values
-- Without dictionary: ~150 MB for status column
-- With dictionary: ~40 MB + tiny dictionary overhead
-- Savings: ~73%

3. Content-Addressed Storage (STORAGE CONTENT_ADDRESSED)

Hash-based deduplication using Blake3 for large values (>1KB).

Characteristics:

  • Values hashed with Blake3 (fast cryptographic hash)
  • Identical values stored once regardless of how many rows reference them
  • Row tuples contain only 32-byte hash references
  • Automatic: only applies to values >= 1KB

Storage format:

cas:{blake3_hash} → original value bytes
Row tuple: Value::CasRef { hash: [u8; 32] }

When to use:

  • Document content (many emails use same template)
  • Profile photos/avatars (many users have default)
  • JSON payloads (API responses with common structures)
  • Large text fields with duplicates

Storage savings:

ScenarioDuplicate RateSavings
Email templates80% same~80%
Default avatars95% same~95%
Error messages99% same~99%

Example:

CREATE TABLE documents (
id INT PRIMARY KEY,
filename TEXT,
content TEXT STORAGE CONTENT_ADDRESSED, -- Many docs have same content
thumbnail BYTEA STORAGE CONTENT_ADDRESSED -- Many have default thumbnail
);
-- Insert 1M documents where 70% are duplicates of 100 templates
INSERT INTO documents VALUES (1, 'welcome.txt', 'Welcome to our service...');
INSERT INTO documents VALUES (2, 'terms.txt', 'Welcome to our service...'); -- Same content, stored once
-- Storage: 100 unique templates * 5KB each = 500KB
-- Without CAS: 1M * 5KB = 5GB
-- Savings: ~99.99%

Note: Small values (<1KB) are stored inline to avoid hash overhead exceeding storage benefit.


4. Columnar Storage (STORAGE COLUMNAR)

Column-grouped batch storage optimized for analytics workloads.

Characteristics:

  • Values grouped by column in batches of 1024
  • Same-type values stored together for better LZ4 compression
  • Efficient for aggregations (SUM, AVG, COUNT)
  • Better compression for numeric sequences

Storage format:

col:{table}:{column}:{batch_id} → bincode Vec<Value> (1024 values)
Row tuple: Value::ColumnarRef (placeholder)

When to use:

  • Time-series data (timestamps, metrics)
  • Aggregation-heavy analytics
  • Range scans over numeric columns
  • Data warehouse fact tables

Performance benefits:

OperationRow-orientedColumnarImprovement
SELECT SUM(amount)Scan all columnsScan one column3-5x faster
SELECT AVG(price)Deserialize tuplesRead batch2-4x faster
Range scanRandom I/OSequential I/O2-3x faster

Compression benefits:

Data PatternRow-oriented LZ4Columnar LZ4Improvement
Sequential INT83.96x5-8x+30-50%
Float sequences3.70x5-7x+35-50%
Timestamps3.96x6-10x+50-60%

Example:

CREATE TABLE metrics (
id INT PRIMARY KEY,
sensor_id INT,
timestamp INT8 STORAGE COLUMNAR, -- Sequential timestamps
temperature FLOAT8 STORAGE COLUMNAR, -- Sensor readings
pressure FLOAT8 STORAGE COLUMNAR, -- Similar numeric pattern
status TEXT STORAGE DICTIONARY -- Mix with dictionary
);
-- Analytics query benefits from columnar storage
SELECT
sensor_id,
AVG(temperature) as avg_temp,
MAX(pressure) as max_pressure
FROM metrics
WHERE timestamp > 1704067200000
GROUP BY sensor_id;
-- Only reads timestamp, temperature, pressure columns
-- Skips id, sensor_id, status data

Combining Storage Modes

You can use different storage modes for different columns in the same table:

CREATE TABLE e_commerce_events (
-- Default storage for keys
id INT PRIMARY KEY,
user_id INT,
-- Dictionary for low-cardinality
event_type TEXT STORAGE DICTIONARY, -- click, view, purchase, etc.
country TEXT STORAGE DICTIONARY, -- ~200 countries
device_type TEXT STORAGE DICTIONARY, -- mobile, desktop, tablet
-- Content-addressed for large duplicates
user_agent TEXT STORAGE CONTENT_ADDRESSED, -- Many users same browser
referrer_url TEXT STORAGE CONTENT_ADDRESSED, -- Common referrers
-- Columnar for analytics
timestamp INT8 STORAGE COLUMNAR, -- Time-series analysis
session_duration INT4 STORAGE COLUMNAR, -- Aggregations
page_views INT4 STORAGE COLUMNAR -- Aggregations
);

Online Migration

When you change a column’s storage mode, HeliosDB Nano performs an online migration:

  1. Scan existing rows one by one
  2. Decode from old format (if any)
  3. Encode to new format
  4. Write back to storage

Migration Performance

Source ModeTarget ModeRows/sec
Default → Dictionary250K-300K
Default → CAS200K-250K
Default → Columnar150K-200K
Dictionary → Default300K-350K
CAS → Default200K-250K

Example Migration

-- Check current storage (if supported)
SELECT column_name, storage_mode
FROM information_schema.columns
WHERE table_name = 'events';
-- Migrate status column (1M rows takes ~4 seconds)
ALTER TABLE events ALTER COLUMN status SET STORAGE DICTIONARY;
-- Migrate payload column (1M rows with 10KB payloads takes ~5 seconds)
ALTER TABLE events ALTER COLUMN payload SET STORAGE CONTENT_ADDRESSED;

Best Practices

1. Choose Based on Data Characteristics

Data PatternRecommended Mode
<1000 unique valuesDICTIONARY
>50% duplicate large values (>1KB)CONTENT_ADDRESSED
Time-series or aggregation targetsCOLUMNAR
Everything elseDEFAULT

2. Don’t Over-Optimize

  • Default is fine for most columns
  • Only use specialized modes when there’s clear benefit
  • Profile your actual query patterns

3. Consider Query Patterns

Query PatternBest Mode
Point lookups by PKDEFAULT
Filter by categoryDICTIONARY
Full-text or document searchCONTENT_ADDRESSED
Aggregations (SUM, AVG, COUNT)COLUMNAR
Range scansCOLUMNAR

4. Monitor Storage Usage

-- Check table sizes (when available)
SELECT
table_name,
pg_total_relation_size(table_name) as total_size
FROM information_schema.tables
WHERE table_schema = 'public';

Storage Mode Reference

SQL Syntax

CREATE TABLE:

CREATE TABLE table_name (
column_name TYPE STORAGE {DEFAULT | DICTIONARY | CONTENT_ADDRESSED | COLUMNAR},
...
);

ALTER TABLE:

ALTER TABLE table_name
ALTER COLUMN column_name
SET STORAGE {DEFAULT | DICTIONARY | CONTENT_ADDRESSED | COLUMNAR};

Internal Value Types

pub enum Value {
// ... standard types ...
/// Dictionary-encoded string reference
DictRef { dict_id: u32 },
/// Content-addressed hash reference
CasRef { hash: [u8; 32] },
/// Placeholder for columnar storage
ColumnarRef,
}

Key Formats

ModeKey PatternValue
Dictionarydict:{table}:{column}Serialized dictionary
CAScas:{blake3_hash}Original bytes
Columnarcol:{table}:{column}:{batch_id}Batch of 1024 values

Troubleshooting

Dictionary Too Large

If you see warnings about dictionary size exceeding 64K entries:

-- Check cardinality before using dictionary
SELECT COUNT(DISTINCT status) FROM orders;
-- If >64K unique values, use DEFAULT instead
ALTER TABLE orders ALTER COLUMN status SET STORAGE DEFAULT;

CAS Not Deduplicating

Content-addressed storage only deduplicates values >= 1KB:

-- Check average value size
SELECT AVG(LENGTH(content)) FROM documents;
-- If <1KB, CAS won't help - use DEFAULT
ALTER TABLE documents ALTER COLUMN content SET STORAGE DEFAULT;

Slow Aggregations

If aggregations are slow on DEFAULT storage, consider COLUMNAR:

-- Profile query
EXPLAIN ANALYZE SELECT SUM(amount) FROM transactions;
-- If scanning many columns, switch to columnar
ALTER TABLE transactions ALTER COLUMN amount SET STORAGE COLUMNAR;

Performance Benchmarks

10GB Dataset Results

Storage ModeInsert RateStorage SizeCompression
Default + LZ4700K rows/sec2.7 GB3.96x
Dictionary + LZ4650K rows/sec1.2 GB8.9x
CAS (70% dupes)550K rows/sec0.8 GB13.4x
Columnar + LZ4600K rows/sec2.1 GB5.1x

Query Performance

Query TypeDefaultColumnarImprovement
Point lookup0.1ms0.3ms-3x (worse)
Full scan850ms320ms2.6x better
SUM aggregation450ms120ms3.7x better
AVG aggregation480ms130ms3.7x better


Last Updated: 2026-01-17 Version: HeliosDB Nano v3.6+