HeliosDB Resource Limits Reference
A comprehensive guide to understanding, configuring, and managing resource limits in HeliosDB production deployments. This reference covers system-level limits, database-level limits, capacity planning, and optimization strategies for different workloads.
Table of Contents
- Quick Reference Tables
- System-Level Resource Limits
- Database-Level Resource Limits
- Configuration Examples
- Capacity Planning Guide
- Monitoring Resource Usage
- Workload-Specific Tuning
- Best Practices
Quick Reference Tables
System-Level Limits Summary
| Resource | Default | Small Deployment | Medium Deployment | Large Deployment | Impact |
|---|---|---|---|---|---|
| Memory | 8GB | 4GB | 16GB | 64GB+ | Query performance, Cache hit rate |
| Max Connections | 10,000 | 500 | 5,000 | 20,000+ | Concurrent users, Connection overhead |
| Connections per User | 100 | 10 | 50 | 200 | Application concurrency |
| File Descriptors | 10,000 | 2,500 | 25,000 | 65,000+ | Parallel I/O operations |
| CPU Cores | Auto-detect | 2-4 | 8-16 | 32+ | Parallelism, Query throughput |
| Disk Space | 500GB | 100GB | 1TB | 10TB+ | Data retention, Compaction overhead |
Database-Level Limits Summary
| Resource | Default | Max | Configurable | Impact |
|---|---|---|---|---|
| Query Timeout | 30 seconds | 5 minutes | Per-query | Query response time |
| Query Memory Limit | 1GB | Configurable | Yes | Large joins, aggregations |
| Connection Memory | 512MB | Configurable | Yes | Connection overhead |
| Total Query Memory | 8GB | Configurable | Yes | Concurrent query capacity |
| Max Result Rows | 1,000,000 | Configurable | Yes | Memory usage, Client buffering |
| Statement Complexity Limit | 10,000 | Configurable | Yes | Optimizer performance |
| Transaction Size Limit | 2GB | Configurable | Yes | Memory, Log files |
| Index Size Limit | Unlimited | Configurable | Yes | Disk space, Build time |
System-Level Resource Limits
Memory Limits
Buffer Pool Configuration
The buffer pool manages in-memory storage of data blocks and caches query results.
Default Configuration:
- Buffer Pool Size: 50% of total system memory (4GB on 8GB system)
- Page Size: 16KB
- Max Pages: Dynamic allocation
- Eviction Policy: LRU (Least Recently Used)
Recommended Values:
```toml
# Small Deployment (4GB RAM total)
[storage]
buffer_pool_mb = 2048  # 2GB
page_size_kb = 16
eviction_policy = "lru"
```

```toml
# Medium Deployment (16GB RAM total)
[storage]
buffer_pool_mb = 8192  # 8GB
page_size_kb = 16
eviction_policy = "lru"
```

```toml
# Large Deployment (64GB+ RAM total)
[storage]
buffer_pool_mb = 32768  # 32GB
page_size_kb = 16
eviction_policy = "arc"  # Adaptive Replacement Cache
```

Performance Impact:
- Too Small (<2GB): High disk I/O, reduced cache hit rates, slower queries
- Too Large (>80% RAM): OOM errors, system swapping, unstable performance
- Optimal: 60-75% of available RAM, cache hit rate >90%
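The sizing guidance above can be scripted; a minimal sketch (hypothetical helper, not a HeliosDB API) that derives a buffer pool target from total system RAM:

```rust
/// Hypothetical helper: derive a buffer pool target from total system RAM (MB),
/// following the 60-75% guidance above. Not a HeliosDB API.
fn buffer_pool_target_mb(total_ram_mb: u64) -> u64 {
    let target = total_ram_mb * 60 / 100;  // start at 60% of RAM
    let ceiling = total_ram_mb * 75 / 100; // never exceed 75%
    target.max(2048).min(ceiling)          // keep at least ~2GB where the ceiling allows
}

fn main() {
    for ram_mb in [4096_u64, 16_384, 65_536] {
        println!("{} MB RAM -> buffer_pool_mb = {}", ram_mb, buffer_pool_target_mb(ram_mb));
    }
}
```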
Warning Signs:
- Cache hit rate drops below 70%
- Disk I/O utilization >80%
- Page eviction rate increases significantly
- Query latency increases unexpectedly
Working Memory
Memory allocated for query execution (sorting, hashing, temporary tables).
Default Configuration:
- Per Query: 1GB
- Per Connection: 512MB
- Total Limit: 8GB
Recommended Values:
| Workload | Per Query | Per Connection | Recommended Total |
|---|---|---|---|
| OLTP | 256MB | 128MB | 2GB |
| OLAP | 2-4GB | 512MB | 16GB |
| Mixed | 1GB | 256MB | 8GB |
| Edge/IoT | 64MB | 32MB | 512MB |
Configuration:
```rust
// Rust API
use heliosdb_common::MemoryLimits;

let limits = MemoryLimits {
    per_query_mb: 1024,
    per_connection_mb: 512,
    total_mb: 8192,
    pressure_threshold: 0.80,
};
```

Connection Limits
Maximum Connections
Total number of simultaneous database connections.
Default: 10,000 connections
Considerations:
- Each connection consumes ~1-2MB RAM for connection context
- File descriptor (FD) limit must be higher than connection limit
- Connection pooling can reduce actual connection count
Recommended Values:
```toml
# Small Deployment - 500 Users
max_connections = 500
max_connections_per_user = 10
idle_timeout_seconds = 600
```

```toml
# Medium Deployment - 5,000 Users
max_connections = 5000
max_connections_per_user = 50
idle_timeout_seconds = 600
```

```toml
# Large Deployment - 50,000 Users
max_connections = 20000  # Use connection pooling
max_connections_per_user = 200
idle_timeout_seconds = 900
```

Connection Memory Calculation:

```
Total Connection Memory = max_connections × connection_context_size
Typical: 10,000 connections × 1.5MB ≈ 15GB overhead
```
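The same arithmetic as a quick estimator; a sketch using the ~1.5MB context size cited above (hypothetical helper, not a HeliosDB API):

```rust
/// Hypothetical estimator: total connection overhead, per the formula above.
fn connection_overhead_gb(max_connections: u64, context_mb: f64) -> f64 {
    max_connections as f64 * context_mb / 1024.0
}

fn main() {
    // ~1.5MB connection context each, per the typical figure above
    let overhead_gb = connection_overhead_gb(10_000, 1.5);
    let total_ram_gb = 64.0; // assumed host size, for illustration only
    println!(
        "10,000 connections ≈ {:.1}GB ({:.0}% of a {}GB host)",
        overhead_gb,
        overhead_gb / total_ram_gb * 100.0,
        total_ram_gb
    );
}
```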
Warning Signs:
- “Too many connections” errors
- Connection acquisition timeout errors
- Memory usage increases unexpectedly during high-load
- Idle connections not being cleaned up
Per-User Connection Limits
Prevents single user/application from consuming all connections.
Default: 100 connections per user
Recommended:
- Development: 5-10 per user
- Testing: 20-50 per user
- Production: 50-200 per user
File Descriptor Limits
System-Level FD Limits
Every database file, socket, and pipe requires a file descriptor.
Important: OS-level limits must exceed HeliosDB limits.
Check current limits:
```bash
# View soft and hard limits
ulimit -Sn  # Soft limit
ulimit -Hn  # Hard limit

# Recommended permanent configuration
echo "* soft nofile 65536" | sudo tee -a /etc/security/limits.conf
echo "* hard nofile 131072" | sudo tee -a /etc/security/limits.conf
```

Database-Level Configuration:
```toml
# Small Deployment
max_open_files = 2500
file_handle_timeout_seconds = 300
```

```toml
# Medium Deployment
max_open_files = 25000
file_handle_timeout_seconds = 600
```

```toml
# Large Deployment
max_open_files = 65536
file_handle_timeout_seconds = 900
```

FD Consumption Breakdown:
| Category | Count | Notes |
|---|---|---|
| Connections | max_connections | 1-2 FDs per connection |
| SSTable Files | memtable_count * 5 | 5-10 FDs per compaction level |
| WAL (Write-Ahead Log) | 10-20 | Rotating log files |
| Temporary Files | Variable | Query temp files, index builds |
| System FDs | 100-200 | Network, pipes, internal |
Calculation Example:
```
10,000 connections × 1.5 FDs = 15,000
1,000 SSTable files × 1 FD   =  1,000
100 temporary files          =    100
System overhead              =    200
─────────────────────────────────────
Typical requirement: 16,300 FDs
Recommended limit:   25,000+ FDs
```
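A sketch of this FD estimate (hypothetical helper; the 1.5 FDs/connection and 200-FD system overhead come from the breakdown above):

```rust
/// Hypothetical FD estimator following the breakdown table above.
fn estimate_fds(connections: u64, sstables: u64, temp_files: u64) -> u64 {
    let conn_fds = connections * 3 / 2; // ~1.5 FDs per connection
    let system_overhead = 200;          // network, pipes, internal
    conn_fds + sstables + temp_files + system_overhead
}

fn main() {
    let needed = estimate_fds(10_000, 1_000, 100);
    // Size the OS limit with headroom rather than at the estimate itself.
    println!("estimated: {} FDs, recommended limit: {}+", needed, needed * 3 / 2);
}
```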
Warning Signs:
- “Too many open files” errors
- Query execution fails with file access errors
- Compaction operations stall
- Unable to open new database files
Disk Space Considerations
Data Storage Requirements
Includes table data, indexes, WAL, and compaction overhead.
Estimation Formula:
```
Total Disk = (Raw Data Size ÷ Compression Factor)
           + (Indexes × Index Overhead)
           + (WAL × Log Retention)
           + Temp Space (≈25% of data)
```

Compression Factors (HeliosDB defaults):
- Structured data: 5-10x compression
- Numeric data: 10-100x compression (with delta/dictionary encoding)
- Text data: 2-3x compression
- No compression: 1x
Index Overhead:
- B-Tree indexes: 10-20% of table size
- HNSW vector indexes: 5-15% per index
- Multiple indexes on same table: 50-100% total overhead
Example Calculation (Medium Deployment):
```
Raw Data Size:       1TB
Compression Factor:  7x
Compressed Size:     143GB

Primary Table:                143GB
B-Tree Indexes (3 indexes):    42GB (30% of table)
Vector Index (HNSW):           24GB (20% of table)
WAL (30-day retention):        15GB
Temporary Space (25%):         51GB
───────────────────────────────────
Total Disk Requirement:       275GB
Recommended Allocation:       400GB (45% headroom)
```
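For capacity scripts, the estimation formula translates directly; a sketch with hypothetical inputs (the 25% temp-space and 45% headroom figures come from the example above):

```rust
/// Hypothetical disk estimator mirroring the formula above (all sizes in GB).
fn estimate_disk_gb(raw_gb: f64, compression: f64, index_overhead: f64, wal_gb: f64) -> f64 {
    let data = raw_gb / compression;     // compressed table data
    let indexes = data * index_overhead; // e.g. 0.30 + 0.20 for B-Tree + HNSW
    let temp = (data + indexes) * 0.25;  // ~25% temp space
    data + indexes + wal_gb + temp
}

fn main() {
    // 1TB raw, 7x compression, 50% combined index overhead, 15GB WAL
    let total = estimate_disk_gb(1024.0, 7.0, 0.50, 15.0);
    println!("required ≈ {:.0}GB, allocate ≈ {:.0}GB", total, total * 1.45);
}
```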
Recommended Disk Space Headroom:
- Small deployments: 30-50% free space
- Medium deployments: 25-30% free space
- Large deployments: 20-25% free space
- Production critical: Never go below 20% free
Warning Signs:
- Disk usage >80%
- Compaction operations slow down
- Index creation fails
- WAL archival delays
- Query performance degrades
CPU Core Utilization
Parallelism Configuration
HeliosDB automatically scales to available CPU cores.
Default: Auto-detect all available cores
Configuration Options:
```toml
# Small Deployment - 2-4 cores
[compute]
max_parallel_tasks = 4
worker_threads = 4
io_threads = 2
```

```toml
# Medium Deployment - 8-16 cores
[compute]
max_parallel_tasks = 16
worker_threads = 16
io_threads = 4
```

```toml
# Large Deployment - 32+ cores
[compute]
max_parallel_tasks = 64
worker_threads = 32
io_threads = 8
```

Core Allocation Strategy:
| System Cores | Query Workers | I/O Workers | Compaction Workers |
|---|---|---|---|
| 2-4 | 2 | 1 | 1 |
| 8-16 | 8 | 2 | 2 |
| 32-64 | 28 | 4 | 4 |
| 64+ | N-4 | 4 | 4 |
Consideration: Hyper-Threading
- Logical cores = Physical cores × 2 (with HT)
- For database workloads, use logical core count
- Monitor CPU utilization; >90% sustained indicates bottleneck
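Worker pools can be derived from the detected logical core count at startup; a sketch following the allocation table above (the handling of ranges between the table's rows is an assumption):

```rust
use std::thread;

/// Map logical core count to (query, io, compaction) workers per the table above.
/// A sketch: gaps between the table's rows are interpolated conservatively.
fn allocate_workers(cores: usize) -> (usize, usize, usize) {
    match cores {
        0..=4 => (2, 1, 1),
        5..=16 => (8, 2, 2),
        17..=64 => (cores.saturating_sub(8).min(28), 4, 4),
        _ => (cores - 4, 4, 4), // the table's N-4 rule for very large hosts
    }
}

fn main() {
    // Logical (hyper-threaded) core count, per the guidance above
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);
    let (query, io, compaction) = allocate_workers(cores);
    println!("{cores} cores -> query={query} io={io} compaction={compaction}");
}
```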
Warning Signs:
- CPU utilization consistently <30% (underutilization)
- CPU utilization consistently >95% (bottleneck)
- Query latency increases with load (insufficient parallelism)
- Context switches too high (oversubscription)
Database-Level Resource Limits
Query Execution Limits
Query Timeout
Maximum time a query can execute before automatic cancellation.
Default: 30 seconds
Configuration:
```rust
// Global configuration
let config = QueryConfig {
    timeout_ms: 30_000,      // 30 seconds default
    max_timeout_ms: 300_000, // 5 minute absolute max
    auto_cancel: true,
};

// Per-query override
let context = QueryContext::new(user)
    .with_timeout(120_000); // 2 minutes for this query
```

```sql
-- Via SQL
SET SESSION query_timeout = 60000; -- 60 seconds
```

Recommended Values:
| Query Type | Timeout | Max Timeout | Notes |
|---|---|---|---|
| Simple SELECT | 5s | 30s | Primary key lookup |
| Analytical JOIN | 30s | 300s | Requires more I/O |
| Complex Aggregation | 60s | 600s | Full scan, heavy computation |
| Batch ETL | 300s | 3600s | Large data movement |
| Streaming | 0 | 0 | Timeout not applicable |
Configuration Impact:
- Too Short: Valid queries abort, user frustration, incomplete results
- Too Long: Resource exhaustion, cascading failures, poor user experience
- Optimal: Balanced for workload, 90th percentile query time + 50% buffer
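One way to apply the percentile-plus-buffer rule from observed latencies; a sketch (hypothetical helper, not a HeliosDB API):

```rust
/// Hypothetical helper: derive a timeout from observed latencies (ms)
/// using the "90th percentile + 50% buffer" rule above.
fn suggest_timeout_ms(mut latencies: Vec<u64>) -> u64 {
    if latencies.is_empty() {
        return 30_000; // fall back to the 30-second default
    }
    latencies.sort_unstable();
    let idx = (latencies.len() * 90 / 100).saturating_sub(1);
    let p90 = latencies[idx];
    p90 + p90 / 2 // +50% buffer
}

fn main() {
    let observed = vec![12, 18, 25, 40, 55, 70, 90, 120, 400, 2_000];
    println!("suggested timeout: {} ms", suggest_timeout_ms(observed));
}
```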
Query Memory Limit
Maximum memory a single query can allocate.
Default: 1GB per query
Configuration:
```toml
[query]
max_memory_mb = 1024  # 1GB default
max_memory_per_connection_mb = 512
```

```sql
-- Override per connection
SET SESSION work_mem = '2GB';
```

Memory Usage Scenarios:
| Scenario | Typical Usage | Recommended Limit |
|---|---|---|
| Simple SELECT | 10-50MB | 256MB |
| Join 2-3 tables | 100-500MB | 1GB |
| Complex JOIN 5+ tables | 500MB-2GB | 4GB |
| Aggregation on 1B rows | 1-4GB | 8GB |
| Window functions | 500MB-2GB | 4GB |
Out-of-Memory Behavior:
```rust
// When a query exceeds its memory limit:
// 1. Attempt disk spillover (temp files)
// 2. If temp space is exhausted, abort the query
// 3. Return a MemoryExceeded error
```

```toml
# Configuration: Enable spillover
[query]
enable_temp_spill = true
max_temp_spill_mb = 10240  # 10GB temp space
```

Warning Signs:
- Queries consistently hit memory limit
- Disk I/O spikes during large queries
- OOM errors in logs
- Query performance degrades with data size growth
Result Set Size Limits
Maximum rows in query result set.
Default: 1,000,000 rows
Configuration:
```toml
[query]
max_result_rows = 1000000
```

```sql
-- SQL override
SET SESSION result_limit = 5000000;
```

Impact on Different Layers:
| Component | Impact | Calculation |
|---|---|---|
| Network Transfer | Bandwidth usage | 1M rows × 100 bytes/row = 100MB |
| Client Buffering | Client memory | 1M rows × 100 bytes = 100MB |
| Server Memory | Result storage | Varies by serialization |
| Wire Protocol | Encoding overhead | ~20% additional |
Recommended Values:
- REST/HTTP API: 10,000-100,000 rows (minimize network traffic)
- Direct connection: 100,000-1,000,000 rows
- Batch exports: 1,000,000+ rows (implement pagination)
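For batch exports beyond the row limit, keyset pagination keeps every page bounded; a sketch using sqlx (as in the pooling example under Best Practices), with hypothetical table and column names:

```rust
use sqlx::{PgPool, Row};

/// Hypothetical keyset-paginated export: fetch `page_size` rows at a time,
/// resuming from the last id seen, so no single result set hits the limit.
async fn export_all(pool: &PgPool, page_size: i64) -> Result<u64, sqlx::Error> {
    let mut last_id: i64 = 0;
    let mut total = 0u64;
    loop {
        let rows = sqlx::query("SELECT id, payload FROM events WHERE id > $1 ORDER BY id LIMIT $2")
            .bind(last_id)
            .bind(page_size)
            .fetch_all(pool)
            .await?;
        if rows.is_empty() {
            return Ok(total);
        }
        last_id = rows.last().unwrap().get("id");
        total += rows.len() as u64;
        // ... write the page to the export sink here ...
    }
}
```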
Statement Complexity Limits
Query Optimization Limits
Prevents optimizer from spending excessive time on complex queries.
Default: 10,000 optimization operations
Configuration:
```toml
[optimizer]
max_optimization_operations = 10000
max_join_order_permutations = 1000
max_plan_cache_entries = 10000
```

Complexity Indicators:
```
Operation Count = JOINs × CTE_count × UNION_count × subqueries
```

Examples:
- Simple SELECT: 1-10 operations
- 5-table JOIN: 100-500 operations
- 10-table JOIN: 1000-5000 operations
- Query with 10+ CTEs: >10,000 operations
Performance Impact:
- Exceeded: Falls back to simple plan, suboptimal execution
- Well-tuned: Query optimized before 500ms, good performance
- Warning: Optimization taking >2 seconds on a query that touches <500MB of data
Transaction Size Limits
Maximum Transaction Size
Total data modification per transaction.
Default: 2GB per transaction
Configuration:
```toml
[transactions]
max_transaction_size_mb = 2048
```

```sql
-- SQL level
BEGIN TRANSACTION;
-- INSERT/UPDATE/DELETE operations
-- Total writes must be < 2GB
COMMIT;
```

Memory Impact Calculation:
```
Transaction Memory = (INSERT rows × avg_row_size)
                   + (UPDATE rows × avg_row_size)
                   + Index overhead
```

Example:

```
Inserting 10M rows × 500 bytes    = 5GB data
Index overhead (3 indexes × 15%)  = 2.25GB
Lock info                         = 1GB
──────────────────────────────────────────
Total: ~8.25GB memory required
```
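The same calculation as a sketch; all names are hypothetical and the 15%-per-index factor comes from the example above:

```rust
/// Hypothetical transaction-memory estimator mirroring the example above.
fn txn_memory_bytes(rows: u64, avg_row_bytes: u64, indexes: u64, lock_bytes: u64) -> u64 {
    let data = rows * avg_row_bytes;                // modified row data
    let index_overhead = data * indexes * 15 / 100; // ~15% per index, per the example
    data + index_overhead + lock_bytes
}

fn main() {
    let gb: u64 = 1_000_000_000; // decimal GB, matching the example
    let total = txn_memory_bytes(10_000_000, 500, 3, gb);
    println!("~{:.2}GB memory required", total as f64 / gb as f64);
}
```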
Recommendations:
| Workload | Max Size | Batch Size | Impact |
|---|---|---|---|
| OLTP | 100MB | 1M rows | Fast commits |
| OLAP | 2GB+ | Larger batches | Higher resource use |
| Streaming | 500MB | Continuous | Balanced |
| ETL | 1GB | Job-specific | Memory + I/O |
WAL (Write-Ahead Log) Impact:
```
WAL Size ≈ Transaction Size × 1.2
30-day retention: 5GB/day transaction rate = 150GB WAL
```

Index Size Limits
Per-Index Limits
HeliosDB supports unlimited index size by default, but consider:
Memory Impact During Index Builds:
```
Index Build Memory = Data Size × 2-3x
Build Time ≈ Data Size / (Cores × 100MB/s)
```

Examples:
| Index Type | Data Size | Build Memory | Build Time (8 cores) |
|---|---|---|---|
| B-Tree | 100GB | 200-300GB | 15-25 minutes |
| HNSW Vector | 10GB (1M vectors) | 30-50GB | 30-60 minutes |
| Inverted Text | 50GB | 100-150GB | 10-20 minutes |
| BRIN (Block Range) | 100GB | 5-10GB | 2-5 minutes |
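A sketch of the build-time formula above; real builds add sort/merge passes and I/O contention (presumably why the table's figures run higher), so treat the result as a lower bound:

```rust
/// Hypothetical lower-bound estimate for index build time, per the formula above.
fn build_time_secs(data_gb: f64, cores: u32, mb_per_sec_per_core: f64) -> f64 {
    let throughput_mb = cores as f64 * mb_per_sec_per_core;
    data_gb * 1024.0 / throughput_mb
}

fn main() {
    // 100GB B-Tree build on 8 cores at ~100MB/s per core
    let secs = build_time_secs(100.0, 8, 100.0);
    println!("lower bound ≈ {:.0}s ({:.1} min)", secs, secs / 60.0);
}
```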
Recommendation: Set advisory limits
```toml
[storage.index_limits]
max_btree_index_size_gb = 500
max_vector_index_size_gb = 100
max_text_index_size_gb = 200

# Triggers a warning but doesn't block
warning_threshold_percent = 80
```

Warning Signs:
- Index build runs out of memory
- Index build takes >1 hour
- Queries not using indexes due to size
- Random-access performance degradation
Configuration Examples
Example 1: Small Deployment (Development/Testing)
```toml
# Small HeliosDB Configuration (4GB RAM, 4 cores)

[storage]
data_dir = "/var/lib/heliosdb/data"
memtable_size_mb = 64
sstable_size_mb = 128
buffer_pool_mb = 2048
compaction_strategy = "size_tiered"

[network]
listen_addr = "0.0.0.0:5432"
max_message_size_mb = 64
max_connections = 500

[compute]
worker_threads = 4
io_threads = 2
max_parallel_tasks = 4

[query]
timeout_ms = 30000
max_memory_mb = 512
max_result_rows = 100000

[resources]
max_connections = 500
max_connections_per_user = 10
max_open_files = 2500
idle_timeout_seconds = 300

[memory]
total_limit_mb = 3000
query_limit_mb = 512
connection_limit_mb = 256
```

Expected Performance:
- Throughput: 100-500 ops/sec
- Latency: 10-100ms p95
- Concurrent users: 10-20
- Max data: 10-50GB
Example 2: Medium Production Deployment
```toml
# Medium HeliosDB Configuration (16GB RAM, 16 cores)

[storage]
data_dir = "/var/lib/heliosdb/data"
memtable_size_mb = 256
sstable_size_mb = 512
buffer_pool_mb = 8192
compaction_strategy = "leveled"
compaction_level_multiplier = 10

[network]
listen_addr = "0.0.0.0:5432"
rdma_enabled = false
max_message_size_mb = 128
max_connections = 5000

[compute]
worker_threads = 16
io_threads = 4
max_parallel_tasks = 16

[query]
timeout_ms = 60000
max_memory_mb = 2048
max_result_rows = 1000000
enable_temp_spill = true
max_temp_spill_mb = 5120

[resources]
max_connections = 5000
max_connections_per_user = 50
max_open_files = 25000
idle_timeout_seconds = 600
connection_memory_mb = 512

[memory]
total_limit_mb = 12000
query_limit_mb = 2048
connection_limit_mb = 512

[autoscale]
enabled = true
min_cu = 10.0
max_cu = 100.0
target_cpu_percent = 70.0
scale_up_threshold = 80.0
scale_down_threshold = 30.0
```

Expected Performance:
- Throughput: 10,000-50,000 ops/sec
- Latency: 5-20ms p95
- Concurrent users: 100-500
- Max data: 500GB-2TB
Example 3: Large OLAP Deployment
```toml
# Large HeliosDB OLAP Configuration (64GB RAM, 32 cores)

[storage]
data_dir = "/var/lib/heliosdb/data"
memtable_size_mb = 1024
sstable_size_mb = 2048
buffer_pool_mb = 32768
compaction_strategy = "leveled"
compaction_level_multiplier = 20
bloom_filter_bits_per_key = 15

[network]
listen_addr = "0.0.0.0:5432"
rdma_enabled = true
max_message_size_mb = 256
max_connections = 20000

[compute]
worker_threads = 32
io_threads = 8
max_parallel_tasks = 64
simd_enabled = true

[query]
timeout_ms = 600000  # 10 minutes for OLAP
max_memory_mb = 8192
max_result_rows = 10000000
enable_temp_spill = true
max_temp_spill_mb = 51200

[resources]
max_connections = 20000
max_connections_per_user = 200
max_open_files = 65536
idle_timeout_seconds = 900
connection_memory_mb = 512

[memory]
total_limit_mb = 48000
query_limit_mb = 8192
connection_limit_mb = 512

[vector]
index_type = "hnsw"
hnsw_m = 32
hnsw_ef_construction = 400

[gpu]
enabled = true
memory_limit = 16106127360  # 15GB
device_id = 0
enable_aggregations = true
enable_joins = true
```

Expected Performance:
- Throughput: 100,000+ ops/sec for analytical queries
- Latency: 50-500ms p95 (OLAP workload)
- Concurrent users: 1000+
- Max data: 10TB+
Example 4: Edge/IoT Constrained Environment
```toml
# Edge/IoT HeliosDB Configuration (512MB RAM, 2 cores)

[storage]
data_dir = "/var/lib/heliosdb/data"
memtable_size_mb = 16
sstable_size_mb = 32
buffer_pool_mb = 256
compaction_strategy = "size_tiered"
gc_grace_seconds = 86400  # 1 day

[network]
listen_addr = "127.0.0.1:5432"
max_connections = 100
max_message_size_mb = 16

[compute]
worker_threads = 2
io_threads = 1
max_parallel_tasks = 2

[query]
timeout_ms = 5000  # 5 seconds
max_memory_mb = 64
max_result_rows = 10000
enable_temp_spill = false

[resources]
max_connections = 100
max_connections_per_user = 5
max_open_files = 1024
idle_timeout_seconds = 60
connection_memory_mb = 32

[memory]
total_limit_mb = 400
query_limit_mb = 64
connection_limit_mb = 32
```

Expected Performance:
- Throughput: 10-100 ops/sec
- Latency: 50-500ms p95
- Concurrent users: 1-5
- Max data: 1-10GB
Capacity Planning Guide
Estimating Total Resource Needs
Formula:
```
Required_Resources = Baseline + Workload + Safety_Margin
```

Step 1: Baseline Overhead

```
CPU:  1-2 cores (OS, background tasks)
RAM:  2-4GB (OS, system services)
Disk: 10% of total data capacity
```

Step 2: Workload Calculation
For concurrent queries:
```
RAM needed = (Concurrent_Queries × Avg_Query_Memory)
           + (Connections × Connection_Overhead)
           + Buffer_Pool_Size
           + Baseline
```

For disk requirements:

```
Disk needed = (Raw_Data ÷ Compression_Factor)
            + (Indexes × Index_Overhead)
            + (WAL × Retention_Days / 365)
            + Baseline
```

Step 3: Safety Margin
Recommended headroom:
- Small deployments: +50% (budget for growth, peaks)
- Medium deployments: +30% (account for spikes)
- Large deployments: +20% (economies of scale)
- Production critical: +40% minimum
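Putting the three steps together, here is a sketch of the RAM side (hypothetical helper, not a HeliosDB API; the disk side mirrors the earlier disk estimator):

```rust
/// Hypothetical RAM estimator combining steps 1-3 above (values in GB).
fn ram_needed_gb(
    concurrent_queries: u64,
    avg_query_gb: f64,
    connections: u64,
    conn_overhead_gb: f64,
    buffer_pool_gb: f64,
    safety_margin: f64, // e.g. 0.30 for a medium deployment
) -> f64 {
    let baseline = 3.0; // OS + system services (2-4GB)
    let workload = concurrent_queries as f64 * avg_query_gb
        + connections as f64 * conn_overhead_gb
        + buffer_pool_gb;
    (baseline + workload) * (1.0 + safety_margin)
}

fn main() {
    // 200 concurrent queries × 0.5GB, 500 pooled connections × 2MB, 16GB buffer pool
    let gb = ram_needed_gb(200, 0.5, 500, 0.002, 16.0, 0.30);
    println!("RAM needed ≈ {:.0}GB", gb);
}
```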
Capacity Planning Example
Scenario: 500GB data, 1000 concurrent users, mixed OLTP/OLAP
Step 1: Baseline
- CPU: 2 cores
- RAM: 3GB
- Disk: 50GB
Step 2: Workload
Data storage:
```
500GB ÷ 7x compression       = 71GB compressed
Indexes (4 per table × 15%)  = 20GB
WAL 30-day retention         = 5GB
Temp space (25% of data)     = 18GB
Baseline disk (Step 1)       = 50GB
─────────────────────────────────
Total disk: 164GB
```

Concurrent query memory:

```
1000 users × 50MB avg connection          = 50GB
Concurrent queries (20%): 200 × 500MB avg = 100GB
Buffer pool                               = 16GB
─────────────────────────────────────────────
Total RAM: ~166GB (impractical)
```

Solution: Connection pooling

```
1000 users share a connection pool
500 actual DB connections × 2MB  = 1GB overhead
200 concurrent queries × 500MB   = 100GB
Buffer pool                      = 16GB
─────────────────────────────────────
Total RAM: 117GB (still high)
```

Alternative: Query optimization

```
Reduce concurrent OLAP to 10 max (queue the excess)
10 OLAP queries × 2GB (larger)   = 20GB
500 users × 20MB (smaller)       = 10GB
Buffer pool                      = 16GB
─────────────────────────────────────
Total RAM: 46GB
```

Final Capacity Recommendation:

```
RAM:     64GB (headroom for future growth)
CPU:     32 cores (supports parallelism)
Disk:    250GB (networked RAID for redundancy)
Network: 10Gbps (supports concurrent users)
```

Growth Projections
Plan for growth using this formula:
```
Year_N = Current_Size × (1 + Annual_Growth_Rate)^N
```

Example: 500GB current, 30% annual growth
| Year | Data Size | Disk Allocation | RAM Needed | CPU Cores |
|---|---|---|---|---|
| 0 | 500GB | 250GB | 32GB | 16 |
| 1 | 650GB | 325GB | 40GB | 20 |
| 2 | 845GB | 425GB | 52GB | 24 |
| 3 | 1.1TB | 550GB | 68GB | 32 |
| 4 | 1.4TB | 700GB | 88GB | 40 |
| 5 | 1.8TB | 900GB | 114GB | 48 |
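The projection table can be regenerated from the compound-growth formula above; a sketch, where the ~0.5× disk-to-data ratio is read off the table and is an assumption:

```rust
/// Project data size and disk allocation per the compound-growth formula above.
fn main() {
    let current_gb = 500.0;
    let growth = 0.30; // 30% annual growth
    for year in 0..=5 {
        let data = current_gb * (1.0_f64 + growth).powi(year);
        // Disk allocation ≈ 0.5× raw data, mirroring the table's ratio
        println!("Year {}: data ≈ {:.0}GB, disk ≈ {:.0}GB", year, data, data * 0.5);
    }
}
```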
Recommendation: Plan major upgrades every 18-24 months to stay ahead of curve.
Monitoring Resource Usage
Key Metrics to Track
Memory Metrics
```sql
-- Query memory usage
SELECT
    query_id,
    user_id,
    memory_used_mb,
    memory_limit_mb,
    (memory_used_mb::float / memory_limit_mb) AS usage_percent,
    CASE
        WHEN memory_used_mb::float / memory_limit_mb > 0.90 THEN 'CRITICAL'
        WHEN memory_used_mb::float / memory_limit_mb > 0.75 THEN 'HIGH'
        WHEN memory_used_mb::float / memory_limit_mb > 0.50 THEN 'MEDIUM'
        ELSE 'LOW'
    END AS pressure_level
FROM system.active_queries
ORDER BY memory_used_mb DESC;

-- System memory pressure
SELECT
    buffer_pool_usage_percent,
    temp_spill_mb,
    eviction_rate,
    cache_hit_rate,
    gc_pause_ms
FROM system.memory_stats;
```

Connection Metrics
```sql
-- Connection usage
SELECT
    COUNT(*) AS total_connections,
    COUNT(DISTINCT user_id) AS unique_users,
    MAX(EXTRACT(EPOCH FROM (NOW() - last_activity))) AS max_idle_seconds,
    COUNT(*) FILTER (WHERE state = 'idle') AS idle_connections,
    COUNT(*) FILTER (WHERE state = 'active') AS active_connections
FROM system.connections;

-- Per-user connection limits
SELECT
    user_id,
    COUNT(*) AS connection_count,
    max_connections_per_user,
    CASE
        WHEN COUNT(*) >= max_connections_per_user THEN 'LIMIT_EXCEEDED'
        WHEN COUNT(*) > max_connections_per_user * 0.9 THEN 'WARNING'
        ELSE 'OK'
    END AS status
FROM system.connections
GROUP BY user_id, max_connections_per_user
HAVING COUNT(*) > max_connections_per_user * 0.7
ORDER BY connection_count DESC;
```

Disk I/O Metrics
```sql
-- File descriptor usage
SELECT
    open_files,
    max_open_files,
    (open_files::float / max_open_files) * 100 AS usage_percent,
    CASE
        WHEN open_files::float / max_open_files > 0.90 THEN 'CRITICAL'
        WHEN open_files::float / max_open_files > 0.75 THEN 'HIGH'
        ELSE 'OK'
    END AS status
FROM system.resource_limits;

-- I/O statistics
SELECT
    reads_per_second,
    writes_per_second,
    read_latency_ms,
    write_latency_ms,
    cache_miss_rate
FROM system.io_stats;
```

CPU Metrics
```sql
-- Query execution distribution
SELECT
    query_type,
    COUNT(*) AS query_count,
    AVG(execution_time_ms) AS avg_time_ms,
    MAX(execution_time_ms) AS max_time_ms,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY execution_time_ms) AS p95_time_ms
FROM system.query_log
WHERE timestamp > NOW() - INTERVAL '1 hour'
GROUP BY query_type
ORDER BY query_count DESC;

-- CPU utilization
SELECT
    worker_id,
    cpu_percent,
    active_queries,
    queued_queries,
    context_switches_per_sec
FROM system.worker_stats;
```

Alerting Thresholds
| Metric | Warning | Critical | Action |
|---|---|---|---|
| Memory Usage | >80% | >95% | Kill idle queries, reduce cache |
| Connections | >80% limit | At limit | Close idle connections |
| File Descriptors | >80% limit | >95% limit | Reduce open files |
| Query Memory | >1GB | >Limit | Spill to disk or reject |
| CPU Usage | >85% sustained | >98% | Throttle clients, add resources |
| Disk Free | <20% | <10% | Archive data, add storage |
| Compaction Lag | >1 hour | >6 hours | Increase compaction threads |
Monitoring Setup Example
```rust
// HeliosDB monitoring example
use heliosdb_common::ResourceLimitsManager;
use std::time::Duration;

async fn monitor_resources(manager: &ResourceLimitsManager) {
    let mut interval = tokio::time::interval(Duration::from_secs(60));

    loop {
        interval.tick().await;

        let stats = manager.get_statistics();
        let health = manager.health_check().await;

        // Log metrics
        info!("Memory: {}%", stats.memory_usage_percent);
        info!("Connections: {}/{}", stats.active_connections, stats.max_connections);
        info!("Health: {}", health.summary());

        // Alert on thresholds
        if stats.memory_usage_percent > 90.0 {
            warn!("Memory pressure critical!");
        }

        if stats.active_connections as f64 / stats.max_connections as f64 > 0.9 {
            warn!("Connection limit near threshold");
        }

        if !health.is_healthy {
            error!("System health degraded: {}", health.summary());
        }
    }
}
```

Workload-Specific Tuning
OLTP Optimization
Characteristics:
- Small, frequent queries
- Low latency requirement (<100ms p95)
- High throughput (1000s ops/sec)
- Multi-user contention
Configuration:
```toml
[query]
timeout_ms = 10000   # 10 second max
max_memory_mb = 256  # Small queries
max_result_rows = 10000

[resources]
max_connections = 5000  # Many concurrent users
max_connections_per_user = 50

[compute]
worker_threads = <num_cores>  # Use all cores
io_threads = <num_cores / 4>

[cache]
buffer_pool_ratio = 0.60  # 60% of RAM for cache
lru_eviction_enabled = true
```

Query Optimization:
```sql
-- Create covering indexes for common queries
CREATE INDEX idx_orders_user_created
ON orders(user_id, created_at)
INCLUDE (order_total);

-- Use prepared statements
PREPARE get_user_orders (INT) AS
SELECT * FROM orders WHERE user_id = $1;

-- Enable query caching for read-heavy workloads
SET SESSION query_cache_enabled = true;
```

Monitoring:

```sql
-- Monitor slow queries
SELECT
    query,
    COUNT(*) AS executions,
    AVG(execution_time_ms) AS avg_ms,
    MAX(execution_time_ms) AS max_ms
FROM system.query_log
WHERE execution_time_ms > 100
GROUP BY query
ORDER BY max_ms DESC
LIMIT 20;
```

OLAP Optimization
Characteristics:
- Large, complex queries
- Higher latency acceptable (seconds to minutes)
- Moderate throughput requirement
- Batch processing
Configuration:
```toml
[query]
timeout_ms = 600000         # 10 minutes for large scans
max_memory_mb = 8192        # Large aggregate allocations
max_result_rows = 10000000  # Bulk exports

[resources]
max_connections = 100  # Fewer concurrent users
max_connections_per_user = 5

[compute]
worker_threads = <num_cores>
io_threads = <num_cores / 2>

[cache]
buffer_pool_ratio = 0.70  # More RAM for scanning
columnar_cache_enabled = true

[vector]
enable_vectorization = true
hnsw_ef_construction = 400  # Better search quality
```

Query Optimization:
```sql
-- Partitioning for faster scans
CREATE TABLE events (
    id INT,
    timestamp TIMESTAMP,
    data TEXT
) PARTITION BY DATE (timestamp);

-- Materialized views for common aggregations
CREATE MATERIALIZED VIEW daily_stats AS
SELECT
    DATE(timestamp) AS date,
    event_type,
    COUNT(*) AS count,
    AVG(value) AS avg_value
FROM events
GROUP BY DATE(timestamp), event_type;

-- Use column-oriented storage for wide tables
CREATE TABLE analytics_data (
    id INT,
    timestamp TIMESTAMP,
    event_type TEXT,
    user_id INT
    -- 50+ columns
) WITH (storage_format = 'columnar');
```

Monitoring:

```sql
-- Monitor long-running queries
SELECT
    query_id,
    query,
    EXTRACT(EPOCH FROM (NOW() - start_time)) AS elapsed_seconds,
    memory_used_mb,
    rows_scanned,
    rows_returned
FROM system.active_queries
WHERE EXTRACT(EPOCH FROM (NOW() - start_time)) > 30
ORDER BY elapsed_seconds DESC;
```

Mixed Workload Balancing
For systems handling both OLTP and OLAP:
Resource Isolation:
```toml
[workload_isolation]
# Separate thread pools
oltp_threads = <num_cores / 2>
olap_threads = <num_cores / 2>

# Separate memory pools
oltp_memory_mb = 8192
olap_memory_mb = 32768

# Query routing
oltp_max_query_time_ms = 5000
olap_max_query_time_ms = 600000
```

Query Classification:
```sql
-- Automatic routing based on characteristics
SELECT
    CASE
        WHEN execution_time_ms < 100 THEN 'OLTP'
        WHEN execution_time_ms < 5000 THEN 'OLTP_BOUNDARY'
        WHEN execution_time_ms < 60000 THEN 'MIXED'
        ELSE 'OLAP'
    END AS workload_type,
    COUNT(*) AS frequency
FROM system.query_log
GROUP BY workload_type;
```

Concurrency Control:
```rust
// Implement query queuing for balance
if is_olap_query(&query) && olap_queue.len() > 10 {
    // Throttle aggressive OLAP clients
    apply_backpressure();
} else if is_oltp_query(&query) {
    // Prioritize OLTP
    prioritize_execution();
}
```

Best Practices
1. Right-Sizing for Your Workload
- Measure before deploying: Run load tests with representative data
- Use historical data: Analyze past usage patterns
- Plan for growth: Add 25-50% capacity headroom
- Regular reviews: Adjust configuration quarterly based on actual usage
2. Connection Management
```rust
// Use connection pooling
use sqlx::postgres::PgPoolOptions;
use std::time::Duration;

let pool = PgPoolOptions::new()
    .max_connections(100)
    .min_connections(10)
    .acquire_timeout(Duration::from_secs(5))
    .connect(&database_url)
    .await?;

// Reuse connections
let connection = pool.acquire().await?;
```

3. Query Optimization
- Use EXPLAIN ANALYZE to understand query plans
- Create appropriate indexes based on query patterns
- Denormalize when necessary for read-heavy workloads
- Monitor query performance continuously
4. Memory Management
- Set realistic memory limits based on available resources
- Enable spill to disk for temporary large datasets
- Monitor cache hit rates (target >85%)
- Regularly clear temporary files
5. Disk Space Management
```bash
# Monitor disk usage
df -h /var/lib/heliosdb

# Archive old data
heliosdb-cli archive --table events --before 2024-01-01

# Clean up temporary files
rm -rf /var/lib/heliosdb/temp/*
```

6. Production Hardening
- Set OS-level limits for file descriptors
- Enable resource accounting for audit trails
- Implement monitoring with alerting
- Test failover and recovery procedures
- Keep headroom above all limits
7. Performance Baseline
Establish baseline metrics for comparison:
```sql
-- Create baseline metrics
CREATE TABLE baseline_metrics AS
SELECT
    'p95_query_latency' AS metric,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY execution_time_ms) AS value,
    NOW() AS recorded_at
FROM system.query_log
WHERE timestamp > NOW() - INTERVAL '7 days';
```

8. Graceful Degradation
Design for when limits are approached:
```rust
// Implement pressure-aware logic
if memory_manager.get_pressure_level() == PressureLevel::High {
    // Reduce cache sizes
    cache.evict_cold_entries();

    // Increase query timeouts
    adjust_timeouts_for_load();

    // Apply backpressure to clients
    apply_rate_limiting();
}
```

Troubleshooting Common Issues
Issue: “Too Many Connections”
Root Cause: Connection limit exceeded
Solution:
- Check current connection count: `SELECT COUNT(*) FROM system.connections;`
- Close idle connections: `ALTER SYSTEM SET idle_timeout = 300;`
- Implement connection pooling on the application side
- Increase the limit if justified: `ALTER SYSTEM SET max_connections = 10000;`
Issue: Out of Memory Errors
Root Cause: Query memory or system memory exhausted
Solution:
- Check query size: monitor `system.active_queries` for memory usage
- Enable temp spill: `ALTER SYSTEM SET enable_temp_spill = true;`
- Reduce the buffer pool if needed
- Add more RAM or upgrade instance
Issue: Slow Query Performance
Root Cause: Insufficient resources or poor index design
Solution:
- Check CPU utilization: sustained >70% indicates a CPU bottleneck
- Review buffer pool hit rate: Target >90%
- Analyze slow queries: `EXPLAIN ANALYZE SELECT ...`
- Create missing indexes
- Consider query rewrite
Issue: File Descriptor Exhaustion
Root Cause: Too many open files
Solution:
- Check FD usage: `lsof | wc -l`
- Increase the OS limit: `ulimit -n 65536`
- Close unused connections
- Reduce concurrent compactions
For more information:
- Configuration: `/home/app/HeliosDB/heliosdb-common/src/config.rs`
- Resource Management: `/home/app/HeliosDB/docs/quick-starts/performance/RESOURCE_LIMITS_QUICK_REFERENCE.md`
- Resource Leak Prevention: `/home/app/HeliosDB/docs/quick-starts/performance/RESOURCE_LEAK_PREVENTION_QUICK_REFERENCE.md`
Last Updated: 2025-12-30
Document Type: Performance Quick Start Reference
Audience: DevOps, Database Administrators, Application Architects