3-Tier Storage (Hot/Warm/Cold) User Guide
3-Tier Storage (Hot/Warm/Cold) User Guide
Overview
HeliosDB’s 3-tier storage automatically moves data between NVMe SSD (hot), SATA SSD (warm), and S3 (cold) tiers based on access patterns, achieving up to 80% cost reduction while maintaining query performance.
Benefits
- 80%+ cost reduction for large databases (100TB+)
- Automatic data tiering based on age, access frequency, and size
- Sub-millisecond hot tier performance
- Transparent to applications (no code changes)
- Configurable policies per table or partition
Storage Tiers
| Tier | Medium | Latency | Cost/GB | Use Case |
|---|---|---|---|---|
| Hot | NVMe SSD | <1ms | $0.15 | Active data (last 7 days) |
| Warm | SATA SSD | 1-5ms | $0.04 | Recent data (8-30 days) |
| Cold | S3 | 10-50ms | $0.02 | Archive (>30 days) |
Prerequisites
System Requirements
- HeliosDB v3.2 or later
- NVMe storage for hot tier (recommended 1TB+)
- SATA storage for warm tier (recommended 5TB+)
- S3-compatible object storage (AWS S3, MinIO, Ceph)
Storage Planning
Calculate storage needs:
Hot Tier: Active data (7 days) + 20% bufferWarm Tier: Recent data (30 days) + 10% bufferCold Tier: Everything else (unlimited)Step-by-Step Configuration
1. Configure Storage Tiers
Edit /etc/heliosdb/heliosdb.conf:
tiered_storage: enabled: true
# Hot Tier (NVMe) hot_tier: path: /mnt/nvme/heliosdb max_size_gb: 1000 latency_target_ms: 1 cost_per_gb: 0.15
# Warm Tier (SATA) warm_tier: path: /mnt/ssd/heliosdb max_size_gb: 5000 latency_target_ms: 5 cost_per_gb: 0.04
# Cold Tier (S3) cold_tier: type: s3 bucket: heliosdb-cold-tier region: us-east-1 endpoint: https://s3.amazonaws.com access_key_id: ${AWS_ACCESS_KEY_ID} # From environment secret_access_key: ${AWS_SECRET_ACCESS_KEY} latency_target_ms: 50 cost_per_gb: 0.02
# Tiering Policies policies: hot_to_warm: 7d # Move after 7 days warm_to_cold: 30d # Move after 30 days auto_tier: true check_interval: 3600s # Hourly evaluation
# Migration Settings migration: max_concurrent: 5 max_bandwidth_mbps: 100 atomic_transitions: true enable_throttling: true2. Set Up S3 Storage
AWS S3
# Create S3 bucketaws s3 mb s3://heliosdb-cold-tier --region us-east-1
# Enable versioning (optional)aws s3api put-bucket-versioning \ --bucket heliosdb-cold-tier \ --versioning-configuration Status=Enabled
# Configure lifecycle (optional archival to Glacier)aws s3api put-bucket-lifecycle-configuration \ --bucket heliosdb-cold-tier \ --lifecycle-configuration file://lifecycle.jsonlifecycle.json:
{ "Rules": [{ "Id": "Archive old data", "Status": "Enabled", "Transitions": [{ "Days": 90, "StorageClass": "GLACIER" }] }]}MinIO (Self-Hosted)
# heliosdb.conftiered_storage: cold_tier: type: s3 bucket: heliosdb-cold endpoint: http://minio:9000 access_key_id: minioadmin secret_access_key: minioadmin path_style: true # MinIO requires path-style3. Initialize Tiering
-- Enable tiering for databaseALTER DATABASE mydb SET tiering = 'enabled';
-- Check tier statusSELECT * FROM heliosdb.tier_status();Expected output:
tier_name | path | max_size_gb | used_gb | available_gb | status-----------+-------------------+-------------+---------+--------------+-------- hot | /mnt/nvme/... | 1000 | 124 | 876 | online warm | /mnt/ssd/... | 5000 | 892 | 4108 | online cold | s3://helios-cold | unlimited | 45000 | unlimited | onlineSQL Examples
Table-Level Tiering Policies
-- Create table with tiering policyCREATE TABLE events ( id SERIAL PRIMARY KEY, timestamp TIMESTAMPTZ, data JSONB) WITH ( tiering_policy = 'time_based', hot_to_warm = '7 days', warm_to_cold = '30 days');Partition-Level Policies
-- Time-series table with automatic partitioning and tieringCREATE TABLE metrics ( timestamp TIMESTAMPTZ, sensor_id INT, value DOUBLE PRECISION) PARTITION BY RANGE (timestamp);
-- Create monthly partitionsCREATE TABLE metrics_2025_10 PARTITION OF metrics FOR VALUES FROM ('2025-10-01') TO ('2025-11-01') WITH ( tiering_policy = 'time_based', hot_to_warm = '7 days', warm_to_cold = '30 days' );Manual Tier Assignment
-- Force table to specific tierALTER TABLE historical_data SET tier = 'cold';
-- Check current tierSELECT tablename, current_tier, size_mb, last_accessFROM heliosdb.table_tier_infoWHERE tablename = 'historical_data';Viewing Tiering Status
-- Cost analysis reportSELECT * FROM heliosdb.cost_analysis_report();Output:
{ "total_size_gb": 105000, "total_cost_monthly_usd": 2230, "breakdown": { "hot": {"size_gb": 1000, "cost_usd": 150}, "warm": {"size_gb": 5000, "cost_usd": 200}, "cold": {"size_gb": 99000, "cost_usd": 1880} }, "savings_vs_all_hot": { "baseline_cost_usd": 15000, "actual_cost_usd": 2230, "savings_usd": 12770, "savings_percent": 85.1 }}Migration Monitoring
-- View active migrationsSELECT object_key, from_tier, to_tier, progress_percent, estimated_completionFROM heliosdb.active_migrations;
-- Migration historySELECT timestamp, object_key, from_tier, to_tier, duration_ms, successFROM heliosdb.migration_historyWHERE timestamp > now() - interval '24 hours'ORDER BY timestamp DESCLIMIT 20;Common Use Cases
Use Case 1: Time-Series Data
Scenario: IoT sensor data, keep recent data fast, archive old data
CREATE TABLE sensor_readings ( timestamp TIMESTAMPTZ, sensor_id INT, temperature DOUBLE PRECISION, humidity DOUBLE PRECISION) PARTITION BY RANGE (timestamp)WITH ( tiering_policy = 'time_based', hot_to_warm = '7 days', warm_to_cold = '30 days');
-- Query automatically uses appropriate tierSELECT AVG(temperature)FROM sensor_readingsWHERE sensor_id = 123 AND timestamp > now() - interval '1 day'; -- Served from hot tier
SELECT AVG(temperature)FROM sensor_readingsWHERE sensor_id = 123 AND timestamp BETWEEN '2025-01-01' AND '2025-01-31'; -- Served from cold tierUse Case 2: Log Analytics
Scenario: Application logs with retention requirements
CREATE TABLE application_logs ( timestamp TIMESTAMPTZ, level VARCHAR(10), message TEXT, metadata JSONB) WITH ( tiering_policy = 'time_based', hot_to_warm = '3 days', -- Recent logs hot warm_to_cold = '14 days', -- Keep 2 weeks warm cold_retention = '365 days' -- Legal requirement);
-- Automatic cold tier pruning after 1 yearUse Case 3: Archive with Infrequent Access
Scenario: Customer data archives
CREATE TABLE customer_archives ( customer_id INT PRIMARY KEY, archived_date DATE, data JSONB) WITH ( tiering_policy = 'access_based', hot_to_warm = '1 access per 7 days', warm_to_cold = '1 access per 30 days');
-- Frequently accessed customers stay hot-- Inactive customers move to cold automaticallyTroubleshooting
Issue: Slow Cold Tier Queries
Symptom: Queries hitting cold tier are slow (>1 second)
Solution 1: Cache Frequently Accessed Data
-- Pin important historical data to warm tierALTER TABLE historical_data SET tier = 'warm' WHERE access_count > 10;Solution 2: Materialized Views
-- Create aggregated view in hot tierCREATE MATERIALIZED VIEW monthly_summaries ASSELECT DATE_TRUNC('month', timestamp) as month, AVG(value) as avg_value, COUNT(*) as countFROM cold_tier_tableGROUP BY monthWITH (tier = 'hot', refresh_interval = '1 day');Issue: Migration Bandwidth Saturation
Symptom: Network saturated during data migration
Solution: Adjust Throttling
# heliosdb.conftiered_storage: migration: max_bandwidth_mbps: 50 # Reduce from 100 max_concurrent: 3 # Reduce from 5Issue: Hot Tier Full
Symptom:
ERROR: Hot tier capacity exceeded (1050 GB / 1000 GB)Solution 1: Force Immediate Migration
-- Manually trigger migration to free spaceSELECT heliosdb.trigger_tier_migration('hot', 'warm', limit_gb => 200);Solution 2: Adjust Policy
-- Reduce hot tier retentionALTER TABLE my_table SET hot_to_warm = '5 days'; -- Was 7 daysSolution 3: Increase Capacity
# heliosdb.conftiered_storage: hot_tier: max_size_gb: 1500 # Increase from 1000Issue: S3 Access Errors
Symptom:
ERROR: Failed to access cold tier: Access DeniedDiagnosis:
# Test S3 accessaws s3 ls s3://heliosdb-cold-tier --region us-east-1
# Check IAM permissionsaws iam get-user-policy --user-name heliosdb --policy-name s3-accessSolution: Update IAM Policy
{ "Version": "2012-10-17", "Statement": [{ "Effect": "Allow", "Action": [ "s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::heliosdb-cold-tier", "arn:aws:s3:::heliosdb-cold-tier/*" ] }]}Performance Tuning
1. Optimize Tier Thresholds
-- Analyze access patternsSELECT tier_name, AVG(access_frequency) as avg_access_freq, AVG(query_latency_ms) as avg_latencyFROM heliosdb.tier_performance_metricsWHERE timestamp > now() - interval '7 days'GROUP BY tier_name;
-- Adjust based on findings-- If warm tier has high access frequency, keep data there longerALTER DATABASE mydb SET warm_to_cold = '45 days'; -- Increase from 302. Prefetch Optimization
# heliosdb.conftiered_storage: cold_tier: prefetch_enabled: true prefetch_size_mb: 10 # Prefetch adjacent blocks when accessing cold tier3. Compression Settings
tiered_storage: cold_tier: compression: zstd # Better compression, slightly slower compression_level: 3 # Balance compression vs speed4. Multipart Upload Tuning
tiered_storage: cold_tier: multipart_threshold_mb: 100 multipart_chunk_size_mb: 10 # Use multipart for large objectsBest Practices
1. Design for Tiering
-- Include timestamp for time-based policiesCREATE TABLE orders ( id SERIAL PRIMARY KEY, created_at TIMESTAMPTZ NOT NULL, -- Critical for tiering ...) WITH (tiering_policy = 'time_based');
-- Partition large tablesCREATE TABLE events ( event_time TIMESTAMPTZ, ...) PARTITION BY RANGE (event_time);2. Monitor Cost
-- Create monthly cost trackingCREATE TABLE tier_cost_history ASSELECT DATE_TRUNC('month', timestamp) as month, tier_name, AVG(size_gb) as avg_size_gb, AVG(cost_usd) as avg_cost_usdFROM heliosdb.tier_metricsGROUP BY month, tier_name;
-- Set up cost alertsCREATE FUNCTION check_tier_costs() RETURNS TRIGGER AS $$BEGIN IF NEW.total_cost_usd > 3000 THEN RAISE WARNING 'Monthly tier costs exceed $3000: $%', NEW.total_cost_usd; END IF; RETURN NEW;END;$$ LANGUAGE plpgsql;3. Test Tiering Policies
-- Create test environmentCREATE SCHEMA tier_test;
-- Test table with aggressive tieringCREATE TABLE tier_test.sample ( timestamp TIMESTAMPTZ, data TEXT) WITH ( tiering_policy = 'time_based', hot_to_warm = '1 hour', -- Fast testing warm_to_cold = '3 hours');
-- Insert test dataINSERT INTO tier_test.sampleSELECT now() - interval '1 hour' * i, 'test data'FROM generate_series(1, 100) i;
-- Wait and verify tier transitionsSELECT current_tier, COUNT(*)FROM heliosdb.object_tier_infoWHERE table_name = 'tier_test.sample'GROUP BY current_tier;4. Handle Cold Tier Latency
# Application code with cachingfrom functools import lru_cacheimport psycopg2
@lru_cache(maxsize=1000)def fetch_historical_data(customer_id, date): """Cache cold tier queries""" conn = psycopg2.connect(...) cursor = conn.cursor() cursor.execute(""" SELECT * FROM customer_archives WHERE customer_id = %s AND archived_date = %s """, (customer_id, date)) return cursor.fetchall()5. Graceful Degradation
-- Create fallback views for cold tier unavailabilityCREATE VIEW recent_data_view ASSELECT * FROM all_data WHERE timestamp > now() - interval '30 days'UNION ALLSELECT * FROM cached_cold_data; -- Stale cache for emergenciesAdvanced Topics
Custom Tiering Policies
-- Create custom policy based on access frequencyCREATE TIERING POLICY frequent_access AS WHEN access_count > 100 THEN 'hot' WHEN access_count > 10 THEN 'warm' ELSE 'cold';
-- Apply to tableALTER TABLE my_table SET tiering_policy = frequent_access;Lifecycle Management
-- Automatic deletion from cold tier after retention periodCREATE TABLE archived_logs ( ...) WITH ( tiering_policy = 'time_based', cold_retention = '365 days', auto_delete_after_retention = true);Cross-Region Cold Tier
tiered_storage: cold_tier: type: s3 # Multi-region replication regions: - name: us-east-1 bucket: heliosdb-cold-us-east-1 primary: true - name: eu-west-1 bucket: heliosdb-cold-eu-west-1 replica: trueMonitoring
Key Metrics
-- Tier performance dashboardSELECT tier_name, AVG(latency_p50_ms) as p50_latency, AVG(latency_p95_ms) as p95_latency, AVG(latency_p99_ms) as p99_latency, AVG(throughput_mbps) as avg_throughput, AVG(iops) as avg_iopsFROM heliosdb.tier_performanceWHERE timestamp > now() - interval '1 hour'GROUP BY tier_name;Prometheus Metrics
scrape_configs: - job_name: 'heliosdb_tiering' static_configs: - targets: ['localhost:9090'] metrics_path: '/metrics/tiering'Exported metrics:
heliosdb_tier_size_bytes{tier="hot|warm|cold"}heliosdb_tier_used_bytes{tier="hot|warm|cold"}heliosdb_tier_latency_seconds{tier="hot|warm|cold",percentile="50|95|99"}heliosdb_tier_migrations_total{from="X",to="Y"}heliosdb_tier_cost_usd{tier="hot|warm|cold"}
Conclusion
3-tier storage in HeliosDB provides automatic, transparent data tiering with significant cost savings. By following this guide, you can achieve 80%+ cost reduction while maintaining excellent query performance.
Key Takeaways:
- Hot tier (<1ms) for active data
- Warm tier (1-5ms) for recent data
- Cold tier (10-50ms) for archives
- Automatic tiering based on configurable policies
- 80%+ cost savings for large databases
For more information:
- API Reference:
/docs/api/tiering.md - Architecture:
/docs/architecture/tiered-storage.md - S3 Integration:
/docs/integrations/s3-configuration.md