Per-Column Storage Modes Guide
HeliosDB Nano v3.6+
This guide explains how to use per-column storage modes to optimize storage and query performance for different data characteristics.
Overview
HeliosDB Nano provides four storage modes that can be applied on a per-column basis:
| Mode | Syntax | Best For | Storage Savings |
|---|---|---|---|
| Default | (none) | General OLTP workloads | Baseline |
| Dictionary | STORAGE DICTIONARY | Low-cardinality strings | 50-95% |
| Content-Addressed | STORAGE CONTENT_ADDRESSED | Large values with duplicates | ~100% dedup |
| Columnar | STORAGE COLUMNAR | Analytics, aggregations | 20-50% better |
All modes benefit from RocksDB’s built-in LZ4 block compression (3.7-7x automatic compression).
Quick Start
CREATE TABLE with Storage Modes
```sql
CREATE TABLE events (
    id INT PRIMARY KEY,
    event_type TEXT STORAGE DICTIONARY,      -- ~50 unique event types
    user_country TEXT STORAGE DICTIONARY,    -- ~250 country codes
    payload TEXT STORAGE CONTENT_ADDRESSED,  -- Large JSON payloads
    metrics FLOAT8[] STORAGE COLUMNAR,       -- Analytics aggregations
    timestamp INT8 STORAGE COLUMNAR,         -- Time-series data
    description TEXT                         -- Default storage
);
```
ALTER TABLE for Existing Data
```sql
-- Migrate existing column to dictionary encoding
ALTER TABLE events ALTER COLUMN event_type SET STORAGE DICTIONARY;

-- Migrate to content-addressed storage
ALTER TABLE events ALTER COLUMN payload SET STORAGE CONTENT_ADDRESSED;

-- Migrate to columnar storage
ALTER TABLE events ALTER COLUMN metrics SET STORAGE COLUMNAR;

-- Revert to default
ALTER TABLE events ALTER COLUMN description SET STORAGE DEFAULT;
```
Important: ALTER TABLE ... SET STORAGE performs an online migration that restructures all existing data without downtime.
Storage Mode Details
1. Default Storage
Standard row-oriented bincode serialization stored directly in RocksDB.
Characteristics:
- Values stored inline in row tuples
- Best for OLTP point queries
- Optimal for high-cardinality data
- No additional overhead
When to use:
- Primary keys and foreign keys
- High-cardinality strings (UUIDs, unique identifiers)
- Small numeric values
- Data queried by primary key
```sql
CREATE TABLE users (
    id INT PRIMARY KEY,    -- Default: primary key lookups
    email TEXT,            -- Default: unique per user
    created_at TIMESTAMP   -- Default: small fixed-size
);
```
2. Dictionary Encoding (STORAGE DICTIONARY)
Maps string values to compact integer IDs using a per-column dictionary.
Characteristics:
- Value → ID mapping stored separately
- Each unique value stored once in dictionary
- Row tuples contain only 4-byte IDs
- Best for <64K unique values
Storage format:
```
dict:{table}:{column} → bincode HashMap<String, u32>
Row tuple: Value::DictRef { dict_id: u32 }
```
When to use:
- Status codes (pending, approved, rejected)
- Country/region codes (250 unique values)
- Category names (product categories, tags)
- Enum-like string columns
Storage savings:
| Cardinality | Original | Dictionary | Savings |
|---|---|---|---|
| 6 values | 10 bytes avg | 4 bytes | 60% |
| 100 values | 15 bytes avg | 4 bytes | 73% |
| 1000 values | 20 bytes avg | 4 bytes | 80% |
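The encode/decode path can be sketched as follows. This is a hypothetical, self-contained mirror of the mechanism described above (unique values stored once, rows holding only a 4-byte `u32` ID), not HeliosDB Nano's actual implementation:

```rust
use std::collections::HashMap;

/// Minimal per-column dictionary: each unique string is stored once;
/// rows keep only a compact u32 ID (akin to Value::DictRef).
struct Dictionary {
    ids: HashMap<String, u32>, // value -> ID (the dict:{table}:{column} map)
    values: Vec<String>,       // ID -> value, used for decoding
}

impl Dictionary {
    fn new() -> Self {
        Dictionary { ids: HashMap::new(), values: Vec::new() }
    }

    /// Encode a value, adding it to the dictionary on first sight.
    fn encode(&mut self, value: &str) -> u32 {
        if let Some(&id) = self.ids.get(value) {
            return id;
        }
        let id = self.values.len() as u32;
        self.ids.insert(value.to_string(), id);
        self.values.push(value.to_string());
        id
    }

    fn decode(&self, id: u32) -> &str {
        &self.values[id as usize]
    }
}

fn main() {
    let mut dict = Dictionary::new();
    let rows = ["pending", "approved", "pending", "rejected", "pending"];
    let encoded: Vec<u32> = rows.iter().map(|v| dict.encode(v)).collect();
    // Five rows collapse to three dictionary entries; rows store only IDs.
    assert_eq!(encoded, vec![0, 1, 0, 2, 0]);
    assert_eq!(dict.decode(encoded[3]), "rejected");
    println!("{} rows, {} dictionary entries", rows.len(), dict.values.len());
}
```

Repeated values pay only the 4-byte ID cost, which is where the savings in the table above come from.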
Example:
```sql
CREATE TABLE orders (
    id INT PRIMARY KEY,
    status TEXT STORAGE DICTIONARY,          -- 6 values: pending, processing, shipped, delivered, cancelled, returned
    payment_method TEXT STORAGE DICTIONARY,  -- 5 values: card, paypal, bank, crypto, cash
    country TEXT STORAGE DICTIONARY          -- ~200 country codes
);

-- 10M orders with repetitive status values
-- Without dictionary: ~150 MB for status column
-- With dictionary: ~40 MB + tiny dictionary overhead
-- Savings: ~73%
```
3. Content-Addressed Storage (STORAGE CONTENT_ADDRESSED)
Hash-based deduplication using Blake3 for large values (>1KB).
Characteristics:
- Values hashed with Blake3 (fast cryptographic hash)
- Identical values stored once regardless of how many rows reference them
- Row tuples contain only 32-byte hash references
- Automatic: only applies to values >= 1KB
Storage format:
```
cas:{blake3_hash} → original value bytes
Row tuple: Value::CasRef { hash: [u8; 32] }
```
When to use:
- Document content (many emails use same template)
- Profile photos/avatars (many users have default)
- JSON payloads (API responses with common structures)
- Large text fields with duplicates
Storage savings:
| Scenario | Duplicate Rate | Savings |
|---|---|---|
| Email templates | 80% same | ~80% |
| Default avatars | 95% same | ~95% |
| Error messages | 99% same | ~99% |
Example:
```sql
CREATE TABLE documents (
    id INT PRIMARY KEY,
    filename TEXT,
    content TEXT STORAGE CONTENT_ADDRESSED,    -- Many docs have same content
    thumbnail BYTEA STORAGE CONTENT_ADDRESSED  -- Many have default thumbnail
);

-- Insert 1M documents where 70% are duplicates of 100 templates
INSERT INTO documents VALUES (1, 'welcome.txt', 'Welcome to our service...');
INSERT INTO documents VALUES (2, 'terms.txt', 'Welcome to our service...'); -- Same content, stored once

-- Storage: 100 unique templates * 5KB each = 500KB
-- Without CAS: 1M * 5KB = 5GB
-- Savings: ~99.99%
```
Note: Small values (<1KB) are stored inline to avoid hash overhead exceeding storage benefit.
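The dedup-by-hash mechanism can be sketched as below. This is an illustrative toy, not the real implementation: HeliosDB Nano uses Blake3 (a 32-byte cryptographic hash), while this sketch substitutes the stdlib `DefaultHasher` purely so the example is self-contained. The 1KB inline threshold matches the note above:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

const INLINE_THRESHOLD: usize = 1024; // values below 1KB stay inline

/// Toy content-addressed store keyed by content hash.
struct CasStore {
    blobs: HashMap<u64, Vec<u8>>, // hash -> bytes (the cas:{hash} keyspace)
}

enum StoredValue {
    Inline(Vec<u8>), // small values live in the row tuple
    CasRef(u64),     // large values are replaced by a hash reference
}

impl CasStore {
    fn new() -> Self {
        CasStore { blobs: HashMap::new() }
    }

    fn put(&mut self, bytes: &[u8]) -> StoredValue {
        if bytes.len() < INLINE_THRESHOLD {
            return StoredValue::Inline(bytes.to_vec());
        }
        let mut h = DefaultHasher::new();
        bytes.hash(&mut h);
        let key = h.finish();
        // Identical content hashes to the same key, so it is stored once.
        self.blobs.entry(key).or_insert_with(|| bytes.to_vec());
        StoredValue::CasRef(key)
    }
}

fn main() {
    let mut store = CasStore::new();
    let template = vec![b'x'; 5000]; // 5KB payload, above the threshold
    let a = store.put(&template);
    let b = store.put(&template); // duplicate row: no new blob written
    let c = store.put(b"short");  // <1KB: kept inline, no hash overhead
    assert_eq!(store.blobs.len(), 1);
    assert!(matches!(a, StoredValue::CasRef(_)));
    assert!(matches!(b, StoredValue::CasRef(_)));
    assert!(matches!(c, StoredValue::Inline(_)));
    println!("blobs stored: {}", store.blobs.len());
}
```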
4. Columnar Storage (STORAGE COLUMNAR)
Column-grouped batch storage optimized for analytics workloads.
Characteristics:
- Values grouped by column in batches of 1024
- Same-type values stored together for better LZ4 compression
- Efficient for aggregations (SUM, AVG, COUNT)
- Better compression for numeric sequences
Storage format:
```
col:{table}:{column}:{batch_id} → bincode Vec<Value> (1024 values)
Row tuple: Value::ColumnarRef (placeholder)
```
When to use:
- Time-series data (timestamps, metrics)
- Aggregation-heavy analytics
- Range scans over numeric columns
- Data warehouse fact tables
Performance benefits:
| Operation | Row-oriented | Columnar | Improvement |
|---|---|---|---|
| SELECT SUM(amount) | Scan all columns | Scan one column | 3-5x faster |
| SELECT AVG(price) | Deserialize tuples | Read batch | 2-4x faster |
| Range scan | Random I/O | Sequential I/O | 2-3x faster |
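The batching scheme behind these numbers can be sketched as follows, under the stated assumption of 1024-value batches. A hypothetical simplification (one in-memory column of `i64`, no serialization), it shows why an aggregation only touches one column's batches instead of whole row tuples:

```rust
/// Columnar batches: values for one column are grouped in fixed-size
/// batches, so SUM/AVG scan same-typed, sequential data.
const BATCH_SIZE: usize = 1024;

struct ColumnStore {
    batches: Vec<Vec<i64>>, // col:{table}:{column}:{batch_id} -> values
}

impl ColumnStore {
    fn new() -> Self {
        ColumnStore { batches: vec![Vec::new()] }
    }

    fn append(&mut self, v: i64) {
        if self.batches.last().unwrap().len() == BATCH_SIZE {
            self.batches.push(Vec::new()); // current batch full: start a new one
        }
        self.batches.last_mut().unwrap().push(v);
    }

    /// SUM over the column: sequential reads of this column only.
    fn sum(&self) -> i64 {
        self.batches.iter().flatten().sum()
    }
}

fn main() {
    let mut col = ColumnStore::new();
    for v in 1..=3000 {
        col.append(v);
    }
    // 3000 values span ceil(3000 / 1024) = 3 batches.
    assert_eq!(col.batches.len(), 3);
    assert_eq!(col.sum(), 3000 * 3001 / 2);
    println!("batches: {}, sum: {}", col.batches.len(), col.sum());
}
```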
Compression benefits:
| Data Pattern | Row-oriented LZ4 | Columnar LZ4 | Improvement |
|---|---|---|---|
| Sequential INT8 | 3.96x | 5-8x | +30-50% |
| Float sequences | 3.70x | 5-7x | +35-50% |
| Timestamps | 3.96x | 6-10x | +50-60% |
Example:
```sql
CREATE TABLE metrics (
    id INT PRIMARY KEY,
    sensor_id INT,
    timestamp INT8 STORAGE COLUMNAR,     -- Sequential timestamps
    temperature FLOAT8 STORAGE COLUMNAR, -- Sensor readings
    pressure FLOAT8 STORAGE COLUMNAR,    -- Similar numeric pattern
    status TEXT STORAGE DICTIONARY       -- Mix with dictionary
);

-- Analytics query benefits from columnar storage
SELECT sensor_id,
       AVG(temperature) AS avg_temp,
       MAX(pressure) AS max_pressure
FROM metrics
WHERE timestamp > 1704067200000
GROUP BY sensor_id;
-- Only reads timestamp, temperature, pressure columns
-- Skips id, sensor_id, status data
```
Combining Storage Modes
You can use different storage modes for different columns in the same table:
```sql
CREATE TABLE e_commerce_events (
    -- Default storage for keys
    id INT PRIMARY KEY,
    user_id INT,

    -- Dictionary for low-cardinality
    event_type TEXT STORAGE DICTIONARY,          -- click, view, purchase, etc.
    country TEXT STORAGE DICTIONARY,             -- ~200 countries
    device_type TEXT STORAGE DICTIONARY,         -- mobile, desktop, tablet

    -- Content-addressed for large duplicates
    user_agent TEXT STORAGE CONTENT_ADDRESSED,   -- Many users same browser
    referrer_url TEXT STORAGE CONTENT_ADDRESSED, -- Common referrers

    -- Columnar for analytics
    timestamp INT8 STORAGE COLUMNAR,             -- Time-series analysis
    session_duration INT4 STORAGE COLUMNAR,      -- Aggregations
    page_views INT4 STORAGE COLUMNAR             -- Aggregations
);
```
Online Migration
When you change a column’s storage mode, HeliosDB Nano performs an online migration:
- Scan existing rows one by one
- Decode from old format (if any)
- Encode to new format
- Write back to storage
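The four steps above can be sketched as a single migration loop. This is a hypothetical, in-memory illustration (a two-variant `Stored` type standing in for the real row encoding), showing a default-to-dictionary migration that re-encodes rows one by one:

```rust
#[derive(Clone, Debug, PartialEq)]
enum Stored {
    Plain(String), // default storage: value inline in the row
    DictRef(u32),  // dictionary storage: compact 4-byte ID
}

/// Migrate a column from default to dictionary encoding, row by row,
/// following the scan -> decode -> encode -> write-back steps above.
fn migrate_to_dictionary(rows: &mut Vec<Stored>, dict: &mut Vec<String>) {
    for row in rows.iter_mut() {
        // 1-2. Scan the row and decode from the old format.
        let value = match row {
            Stored::Plain(s) => s.clone(),
            Stored::DictRef(id) => dict[*id as usize].clone(), // already migrated
        };
        // 3. Encode into the new format, growing the dictionary as needed.
        let id = match dict.iter().position(|v| *v == value) {
            Some(i) => i as u32,
            None => {
                dict.push(value);
                (dict.len() - 1) as u32
            }
        };
        // 4. Write the re-encoded row back.
        *row = Stored::DictRef(id);
    }
}

fn main() {
    let mut rows = vec![
        Stored::Plain("pending".into()),
        Stored::Plain("shipped".into()),
        Stored::Plain("pending".into()),
    ];
    let mut dict = Vec::new();
    migrate_to_dictionary(&mut rows, &mut dict);
    assert_eq!(rows, vec![Stored::DictRef(0), Stored::DictRef(1), Stored::DictRef(0)]);
    assert_eq!(dict, vec!["pending".to_string(), "shipped".to_string()]);
    println!("migrated {} rows into {} dictionary entries", rows.len(), dict.len());
}
```

Because each row is rewritten independently, the table stays readable throughout, which is what makes the migration online.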
Migration Performance
| Migration | Rows/sec |
|---|---|
| Default → Dictionary | 250K-300K |
| Default → CAS | 200K-250K |
| Default → Columnar | 150K-200K |
| Dictionary → Default | 300K-350K |
| CAS → Default | 200K-250K |
Example Migration
```sql
-- Check current storage (if supported)
SELECT column_name, storage_mode
FROM information_schema.columns
WHERE table_name = 'events';

-- Migrate status column (1M rows takes ~4 seconds)
ALTER TABLE events ALTER COLUMN status SET STORAGE DICTIONARY;

-- Migrate payload column (1M rows with 10KB payloads takes ~5 seconds)
ALTER TABLE events ALTER COLUMN payload SET STORAGE CONTENT_ADDRESSED;
```
Best Practices
1. Choose Based on Data Characteristics
| Data Pattern | Recommended Mode |
|---|---|
| <1000 unique values | DICTIONARY |
| >50% duplicate large values (>1KB) | CONTENT_ADDRESSED |
| Time-series or aggregation targets | COLUMNAR |
| Everything else | DEFAULT |
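The decision table above can be expressed as a small helper. This is a hypothetical function (not a HeliosDB Nano API) whose thresholds come straight from this guide: <1000 distinct values favors dictionary, large (>1KB) values with >50% duplicates favor CAS, aggregation targets favor columnar, and everything else stays default:

```rust
#[derive(Debug, PartialEq)]
enum StorageMode {
    Default,
    Dictionary,
    ContentAddressed,
    Columnar,
}

/// Recommend a storage mode from simple column statistics,
/// mirroring the decision table in this guide.
fn recommend(
    distinct_values: u64,
    avg_value_bytes: u64,
    duplicate_ratio: f64, // fraction of rows whose value repeats
    is_aggregation_target: bool,
) -> StorageMode {
    if distinct_values < 1000 {
        StorageMode::Dictionary
    } else if avg_value_bytes > 1024 && duplicate_ratio > 0.5 {
        StorageMode::ContentAddressed
    } else if is_aggregation_target {
        StorageMode::Columnar
    } else {
        StorageMode::Default
    }
}

fn main() {
    assert_eq!(recommend(6, 10, 0.9, false), StorageMode::Dictionary);
    assert_eq!(recommend(1_000_000, 5000, 0.7, false), StorageMode::ContentAddressed);
    assert_eq!(recommend(1_000_000, 8, 0.0, true), StorageMode::Columnar);
    assert_eq!(recommend(1_000_000, 36, 0.0, false), StorageMode::Default);
    println!("ok");
}
```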
2. Don’t Over-Optimize
- Default is fine for most columns
- Only use specialized modes when there’s clear benefit
- Profile your actual query patterns
3. Consider Query Patterns
| Query Pattern | Best Mode |
|---|---|
| Point lookups by PK | DEFAULT |
| Filter by category | DICTIONARY |
| Full-text or document search | CONTENT_ADDRESSED |
| Aggregations (SUM, AVG, COUNT) | COLUMNAR |
| Range scans | COLUMNAR |
4. Monitor Storage Usage
```sql
-- Check table sizes (when available)
SELECT table_name, pg_total_relation_size(table_name) AS total_size
FROM information_schema.tables
WHERE table_schema = 'public';
```
Storage Mode Reference
SQL Syntax
CREATE TABLE:
```sql
CREATE TABLE table_name (
    column_name TYPE STORAGE {DEFAULT | DICTIONARY | CONTENT_ADDRESSED | COLUMNAR},
    ...
);
```
ALTER TABLE:
```sql
ALTER TABLE table_name
ALTER COLUMN column_name
SET STORAGE {DEFAULT | DICTIONARY | CONTENT_ADDRESSED | COLUMNAR};
```
Internal Value Types
```rust
pub enum Value {
    // ... standard types ...

    /// Dictionary-encoded string reference
    DictRef { dict_id: u32 },

    /// Content-addressed hash reference
    CasRef { hash: [u8; 32] },

    /// Placeholder for columnar storage
    ColumnarRef,
}
```
Key Formats
| Mode | Key Pattern | Value |
|---|---|---|
| Dictionary | dict:{table}:{column} | Serialized dictionary |
| CAS | cas:{blake3_hash} | Original bytes |
| Columnar | col:{table}:{column}:{batch_id} | Batch of 1024 values |
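The key patterns in the table above can be built as plain strings. These helper functions are hypothetical (HeliosDB Nano's internal key construction is not documented here), shown only to make the formats concrete:

```rust
/// Build the dictionary key for a column: dict:{table}:{column}
fn dict_key(table: &str, column: &str) -> String {
    format!("dict:{}:{}", table, column)
}

/// Build a content-addressed key from a hex-encoded Blake3 hash: cas:{hash}
fn cas_key(hash_hex: &str) -> String {
    format!("cas:{}", hash_hex)
}

/// Build a columnar batch key: col:{table}:{column}:{batch_id}
fn col_key(table: &str, column: &str, batch_id: u64) -> String {
    format!("col:{}:{}:{}", table, column, batch_id)
}

fn main() {
    assert_eq!(dict_key("events", "event_type"), "dict:events:event_type");
    assert_eq!(col_key("metrics", "temperature", 3), "col:metrics:temperature:3");
    assert_eq!(cas_key("ab12"), "cas:ab12");
    println!("ok");
}
```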
Troubleshooting
Dictionary Too Large
If you see warnings about dictionary size exceeding 64K entries:
```sql
-- Check cardinality before using dictionary
SELECT COUNT(DISTINCT status) FROM orders;

-- If >64K unique values, use DEFAULT instead
ALTER TABLE orders ALTER COLUMN status SET STORAGE DEFAULT;
```
CAS Not Deduplicating
Content-addressed storage only deduplicates values >= 1KB:
```sql
-- Check average value size
SELECT AVG(LENGTH(content)) FROM documents;

-- If <1KB, CAS won't help - use DEFAULT
ALTER TABLE documents ALTER COLUMN content SET STORAGE DEFAULT;
```
Slow Aggregations
If aggregations are slow on DEFAULT storage, consider COLUMNAR:
```sql
-- Profile query
EXPLAIN ANALYZE SELECT SUM(amount) FROM transactions;

-- If scanning many columns, switch to columnar
ALTER TABLE transactions ALTER COLUMN amount SET STORAGE COLUMNAR;
```
Performance Benchmarks
10GB Dataset Results
| Storage Mode | Insert Rate | Storage Size | Compression |
|---|---|---|---|
| Default + LZ4 | 700K rows/sec | 2.7 GB | 3.96x |
| Dictionary + LZ4 | 650K rows/sec | 1.2 GB | 8.9x |
| CAS (70% dupes) | 550K rows/sec | 0.8 GB | 13.4x |
| Columnar + LZ4 | 600K rows/sec | 2.1 GB | 5.1x |
Query Performance
| Query Type | Default | Columnar | Improvement |
|---|---|---|---|
| Point lookup | 0.1ms | 0.3ms | 3x slower |
| Full scan | 850ms | 320ms | 2.6x better |
| SUM aggregation | 450ms | 120ms | 3.7x better |
| AVG aggregation | 480ms | 130ms | 3.7x better |
Related Documentation
Last Updated: 2026-01-17 Version: HeliosDB Nano v3.6+