Dynamic Autoscaling User Guide

Overview

HeliosDB’s autoscaling framework automatically adjusts compute units (CUs) from zero up to a configured maximum based on workload demand, achieving sub-10-second scale-up and reducing cost through intelligent resource management.

Benefits

  • Dynamic 0-to-max CU scaling based on real-time metrics
  • Sub-10s scale-up, sub-60s scale-down
  • 28%+ cost savings through idle time optimization
  • Oscillation prevention via a damping mechanism
  • Vertical and horizontal scaling support

Key Features

  • CPU, memory, and queue depth monitoring
  • Configurable scaling policies
  • Automatic scale-to-zero for idle workloads
  • Real-time metrics and Prometheus integration
  • Per-second billing accuracy

Prerequisites

  • HeliosDB v3.2+ with autoscale module
  • 4GB+ RAM for resource manager
  • Monitoring tools (Prometheus recommended)

Step-by-Step Configuration

1. Enable Autoscaling

/etc/heliosdb/autoscaling.yaml:

autoscaling:
  enabled: true
  min_cu: 0.0                       # Scale to zero
  max_cu: 16.0                      # Maximum per node
  target_cpu: 70                    # Target CPU utilization (%)
  target_queue: 10                  # Target queue depth

  # Thresholds
  scale_up_threshold: 80            # Scale up at 80% CPU
  scale_down_threshold: 30          # Scale down at 30% CPU

  # Timing
  cooldown_period: 60               # Seconds between scaling actions
  scale_step_percent: 50.0          # Scale by 50% each step

  # Idle handling
  scale_to_zero_idle_seconds: 300   # 5 min idle timeout

  # Damping (prevents oscillation)
  damping_factor: 0.8
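
To make the interplay of thresholds, cooldown, step size, and damping concrete, here is a minimal model of a scaling decision. The exact HeliosDB algorithm is internal, so the function name and the damping formula (pulling the raw target back toward the current level) are assumptions for illustration only:

```python
# Illustrative model of one autoscaling decision. The damping formula and
# function names are assumptions, not the documented HeliosDB internals.

def decide_cu(current_cu, cpu_percent, cfg, seconds_since_last_action):
    """Return the next CU level under threshold + cooldown + damping rules."""
    if seconds_since_last_action < cfg["cooldown_period"]:
        return current_cu  # still cooling down: no change

    step = cfg["scale_step_percent"] / 100.0
    if cpu_percent >= cfg["scale_up_threshold"]:
        raw_target = current_cu * (1 + step)
    elif cpu_percent <= cfg["scale_down_threshold"]:
        raw_target = current_cu * (1 - step)
    else:
        return current_cu  # inside the dead band between thresholds

    # Damping pulls the target back toward the current level so a single
    # noisy sample cannot swing the node all the way up or down.
    d = cfg["damping_factor"]
    damped = current_cu + d * (raw_target - current_cu)
    return max(cfg["min_cu"], min(cfg["max_cu"], damped))

cfg = {"min_cu": 0.0, "max_cu": 16.0, "scale_up_threshold": 80,
       "scale_down_threshold": 30, "cooldown_period": 60,
       "scale_step_percent": 50.0, "damping_factor": 0.8}

print(decide_cu(4.0, 90, cfg, 120))  # busy: 4.0 -> 5.6 (damped, not 6.0)
print(decide_cu(4.0, 50, cfg, 120))  # dead band: stays at 4.0
print(decide_cu(4.0, 90, cfg, 30))   # within cooldown: stays at 4.0
```

Note how the 50% step would target 6.0 CU, but damping at 0.8 settles on 5.6: the scaler converges over a few cycles instead of overshooting.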

2. Configure Monitoring

activity_monitor:
  idle_timeout_seconds: 300
  activity_threshold_seconds: 1
  activity_window_size: 1000
  sample_interval_seconds: 10
  track_readonly_queries: true
  track_system_queries: false

3. Resource Limits

resources:
  total_cpu_cores: 64
  total_memory_mb: 262144   # 256 GB
  min_cpu_cores: 1
  min_memory_mb: 2048       # 2 GB
  max_cpu_cores: 16
  max_memory_mb: 65536      # 64 GB

SQL Examples

Monitoring Autoscaling

-- Check current autoscaling status
SELECT
  node_id,
  current_cu,
  target_cpu_percent,
  actual_cpu_percent,
  queue_depth,
  last_scaling_action,
  last_scaling_time
FROM heliosdb.autoscale_status;

Viewing Scaling History

-- Recent scaling events
SELECT
  timestamp,
  event_type,    -- ScaleUp, ScaleDown, NoChange
  from_cu,
  to_cu,
  reason,
  confidence,
  duration_ms
FROM heliosdb.autoscaling_events
WHERE timestamp > now() - interval '24 hours'
ORDER BY timestamp DESC;

Manual Scaling

-- Temporarily disable autoscaling
SELECT heliosdb.disable_autoscaling('node-1');
-- Manually set CU level
SELECT heliosdb.set_cu('node-1', 4.0);
-- Re-enable autoscaling
SELECT heliosdb.enable_autoscaling('node-1');

Performance Metrics

-- Autoscaling effectiveness
SELECT
  DATE_TRUNC('hour', timestamp) AS hour,
  AVG(cpu_utilization) AS avg_cpu,
  AVG(current_cu) AS avg_cu,
  COUNT(CASE WHEN event_type = 'ScaleUp' THEN 1 END) AS scale_ups,
  COUNT(CASE WHEN event_type = 'ScaleDown' THEN 1 END) AS scale_downs
FROM heliosdb.autoscale_metrics
WHERE timestamp > now() - interval '24 hours'
GROUP BY hour
ORDER BY hour;

Common Use Cases

Use Case 1: E-commerce Website

Traffic Pattern: High during business hours, low overnight

autoscaling:
  min_cu: 0.5                        # Minimum 0.5 CU (never zero for web)
  max_cu: 8.0
  target_cpu: 65
  scale_up_threshold: 75
  scale_down_threshold: 35
  cooldown_period: 120               # 2 min (more conservative)
  scale_to_zero_idle_seconds: 3600   # 1 hour

Expected Behavior:

  • 8 AM - 5 PM: 2-6 CU (busy hours)
  • 5 PM - 11 PM: 1-3 CU (evening traffic)
  • 11 PM - 8 AM: 0.5 CU (maintenance only)
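
A back-of-the-envelope cost check for this pattern, taking the midpoint of each CU range and the illustrative $0.10/CU-hour rate used in the cost-tracking section below (real bills depend on actual per-second usage and your pricing):

```python
# Rough daily cost sketch for the e-commerce traffic pattern above.
# CU levels are midpoints of the expected ranges; the rate is illustrative.

RATE_USD_PER_CU_HOUR = 0.10

schedule = [  # (hours, typical CU level)
    (9, 4.0),   # 8 AM - 5 PM: busy hours, midpoint of 2-6 CU
    (6, 2.0),   # 5 PM - 11 PM: evening traffic, midpoint of 1-3 CU
    (9, 0.5),   # 11 PM - 8 AM: maintenance floor
]

autoscaled_cu_hours = sum(hours * cu for hours, cu in schedule)
fixed_cu_hours = 24 * 8.0  # statically provisioned at max_cu

print(f"autoscaled: {autoscaled_cu_hours} CU-hours, "
      f"${autoscaled_cu_hours * RATE_USD_PER_CU_HOUR:.2f}/day")
print(f"fixed:      {fixed_cu_hours} CU-hours, "
      f"${fixed_cu_hours * RATE_USD_PER_CU_HOUR:.2f}/day")
print(f"savings:    {1 - autoscaled_cu_hours / fixed_cu_hours:.0%}")
```

Under these assumptions the autoscaled configuration uses 52.5 CU-hours/day against 192 for a fixed 8-CU deployment, comfortably above the 28% savings figure quoted in the Benefits section.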

Use Case 2: Batch Processing

Pattern: Periodic batch jobs every 4 hours

autoscaling:
  min_cu: 0.0                       # Scale to zero between batches
  max_cu: 16.0                      # Full power during batch
  scale_up_threshold: 70            # Quick scale-up
  scale_down_threshold: 20          # Aggressive scale-down
  cooldown_period: 30               # Fast response
  scale_to_zero_idle_seconds: 180   # 3 min

Use Case 3: Development Environment

Pattern: Used only during work hours

autoscaling:
  min_cu: 0.0
  max_cu: 4.0                       # Lower max for dev
  scale_to_zero_idle_seconds: 600   # 10 min
  scale_step_percent: 100.0         # Double/halve each time
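
With `scale_step_percent: 100.0`, each scale-up doubles the CU level (capped at `max_cu`) and each scale-down halves it. A hypothetical morning ramp-up from scale-to-zero might look like this; the 0.5 CU wake-up level is an assumption, not a documented default:

```python
# Sketch of the doubling ramp with scale_step_percent: 100.
# The 0.5 CU wake-up level after scale-to-zero is an assumption.

MAX_CU = 4.0  # dev environment max_cu

def scale_up(cu):
    return min(MAX_CU, cu * 2 if cu > 0 else 0.5)

levels = [0.0]
for _ in range(4):
    levels.append(scale_up(levels[-1]))

print(levels)  # [0.0, 0.5, 1.0, 2.0, 4.0]
```

Four scale-up cycles take the environment from idle to its 4 CU ceiling, which is why a 100% step suits bursty dev usage.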

Troubleshooting

Issue: Frequent Scaling (Oscillation)

Symptom: Node scales up/down repeatedly

Diagnosis:

SELECT
  DATE_TRUNC('minute', timestamp) AS minute,
  COUNT(*) AS scaling_events
FROM heliosdb.autoscaling_events
WHERE timestamp > now() - interval '1 hour'
GROUP BY minute
HAVING COUNT(*) > 2;   -- More than 2 events/minute is problematic

Solution 1: Increase Cooldown

autoscaling:
  cooldown_period: 180   # Increase from 60

Solution 2: Increase Damping

autoscaling:
  damping_factor: 0.9   # Increase from 0.8

Solution 3: Widen Threshold Gap

autoscaling:
  scale_up_threshold: 85     # Increase from 80
  scale_down_threshold: 25   # Decrease from 30
  # Wider gap prevents oscillation

Issue: Slow Scale-Up

Symptom: Users experience latency during traffic spikes

Solution: More Aggressive Scale-Up

autoscaling:
  scale_up_threshold: 70      # Lower threshold
  scale_step_percent: 100.0   # Double instead of 50% increase
  cooldown_period: 30         # Faster response

Issue: High Costs

Symptom: Autoscaling not reducing costs as expected

Diagnosis:

-- Analyze CU utilization
SELECT
  DATE(timestamp) AS date,
  AVG(current_cu) AS avg_cu,
  MIN(current_cu) AS min_cu,
  MAX(current_cu) AS max_cu,
  AVG(cpu_utilization) AS avg_cpu
FROM heliosdb.autoscale_metrics
WHERE timestamp > now() - interval '7 days'
GROUP BY date;

Solution: More Aggressive Scale-Down

autoscaling:
  scale_down_threshold: 40          # Increase from 30
  scale_to_zero_idle_seconds: 180   # Reduce from 300

Performance Tuning

1. Optimize for Workload Type

OLTP (Transactional):

autoscaling:
  target_cpu: 60           # Lower target for better responsiveness
  target_queue: 5          # Low queue tolerance
  scale_up_threshold: 70

OLAP (Analytical):

autoscaling:
  target_cpu: 80             # Higher utilization OK
  target_queue: 20           # Can tolerate queuing
  scale_down_threshold: 40   # Keep resources longer

Mixed Workload:

autoscaling:
  target_cpu: 70
  target_queue: 10
  scale_up_threshold: 80
  scale_down_threshold: 30

2. Time-Based Policies

-- Create scheduled scaling policy
-- (RETURNS void, not TRIGGER: the function is called directly by the scheduler)
CREATE FUNCTION business_hours_policy()
RETURNS void AS $$
BEGIN
  IF EXTRACT(HOUR FROM now()) BETWEEN 8 AND 17 THEN
    -- Business hours: maintain minimum 2 CU
    ALTER SYSTEM SET heliosdb.autoscale_min_cu = 2.0;
  ELSE
    -- Off-hours: allow scale to zero
    ALTER SYSTEM SET heliosdb.autoscale_min_cu = 0.0;
  END IF;
END;
$$ LANGUAGE plpgsql;

-- Schedule hourly check
SELECT cron.schedule('autoscale-policy', '0 * * * *',
  'SELECT business_hours_policy()');

3. Predictive Scaling

-- Analyze historical patterns
CREATE MATERIALIZED VIEW scaling_patterns AS
SELECT
  EXTRACT(DOW FROM timestamp) AS day_of_week,
  EXTRACT(HOUR FROM timestamp) AS hour_of_day,
  AVG(current_cu) AS typical_cu,
  AVG(cpu_utilization) AS typical_cpu
FROM heliosdb.autoscale_metrics
WHERE timestamp > now() - interval '30 days'
GROUP BY day_of_week, hour_of_day;

-- Pre-scale based on the pattern for the upcoming hour
SELECT heliosdb.set_target_cu(
  (SELECT typical_cu FROM scaling_patterns
   WHERE day_of_week = EXTRACT(DOW FROM now())
     AND hour_of_day = EXTRACT(HOUR FROM now() + interval '10 minutes'))
);

Best Practices

1. Start Conservative

# Initial configuration
autoscaling:
  min_cu: 1.0                       # Don't start with 0
  max_cu: 8.0                       # Don't start with max
  cooldown_period: 120              # Longer cooldown
  scale_to_zero_idle_seconds: 900   # 15 min

After monitoring for a week, gradually optimize:

# Optimized after observation
autoscaling:
  min_cu: 0.0    # Now confident to scale to zero
  max_cu: 16.0   # Increase max after testing
  cooldown_period: 60
  scale_to_zero_idle_seconds: 300

2. Monitor Before Optimizing

-- Weekly performance review
SELECT
  'Scale-ups' AS metric,
  COUNT(*) AS count,
  AVG(duration_ms) AS avg_duration_ms
FROM heliosdb.autoscaling_events
WHERE event_type = 'ScaleUp'
  AND timestamp > now() - interval '7 days'
UNION ALL
SELECT
  'Scale-downs',
  COUNT(*),
  AVG(duration_ms)
FROM heliosdb.autoscaling_events
WHERE event_type = 'ScaleDown'
  AND timestamp > now() - interval '7 days';

3. Alert on Anomalies

-- Create alert for excessive scaling
CREATE FUNCTION check_scaling_frequency()
RETURNS void AS $$
DECLARE
  events_per_hour integer;
BEGIN
  SELECT COUNT(*) INTO events_per_hour
  FROM heliosdb.autoscaling_events
  WHERE timestamp > now() - interval '1 hour';

  IF events_per_hour > 10 THEN
    RAISE WARNING 'Excessive scaling: % events in last hour', events_per_hour;
  END IF;
END;
$$ LANGUAGE plpgsql;

SELECT cron.schedule('check-scaling', '*/15 * * * *',
  'SELECT check_scaling_frequency()');

4. Cost Tracking

-- Daily cost analysis (assumes $0.10 per CU-hour)
-- The window function must run in a subquery before aggregation:
-- SQL does not allow SUM() over a LEAD() in the same query level.
CREATE VIEW daily_autoscale_costs AS
SELECT
  DATE(timestamp) AS date,
  AVG(current_cu) AS avg_cu,
  SUM(cu_seconds) / 3600.0 AS cu_hours,
  SUM(cu_seconds) / 3600.0 * 0.10 AS cost_usd
FROM (
  -- Weight each sample's CU level by the time until the next sample
  SELECT
    timestamp,
    current_cu,
    current_cu * EXTRACT(EPOCH FROM
      LEAD(timestamp, 1, now()) OVER (ORDER BY timestamp) - timestamp
    ) AS cu_seconds
  FROM heliosdb.autoscale_metrics
) samples
GROUP BY date
ORDER BY date DESC;

Advanced Topics

Horizontal Scaling

Enable multi-node scaling:

autoscaling:
  horizontal:
    enabled: true
    min_nodes: 1
    max_nodes: 10
    node_max_cu: 16.0
    scale_out_cpu_threshold: 85.0
    scale_in_cpu_threshold: 20.0
    drain_timeout_seconds: 120
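
A common design for combined vertical and horizontal scaling is to grow a node vertically first and only add nodes once existing ones are saturated. The HeliosDB policy is not spelled out here, so the following is an illustrative model of that pattern, not the documented behavior:

```python
# Illustrative model of vertical-first scaling with horizontal overflow.
# Constants mirror the horizontal config above; the policy itself is assumed.

NODE_MAX_CU = 16.0
SCALE_OUT_CPU = 85.0
MAX_NODES = 10

def next_action(node_cus, cluster_cpu_percent):
    """Return 'vertical', 'scale_out', or 'hold' for a list of per-node CUs."""
    if cluster_cpu_percent < SCALE_OUT_CPU:
        return "hold"
    if any(cu < NODE_MAX_CU for cu in node_cus):
        return "vertical"    # headroom left on an existing node
    if len(node_cus) < MAX_NODES:
        return "scale_out"   # all nodes saturated: add a node
    return "hold"            # cluster already at max_nodes

print(next_action([8.0, 16.0], 90.0))   # vertical
print(next_action([16.0, 16.0], 90.0))  # scale_out
print(next_action([16.0, 16.0], 60.0))  # hold
```

Vertical-first keeps latency-sensitive scale-ups fast (no node provisioning or connection draining) and reserves scale-out for sustained saturation.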

Custom Metrics

-- Register custom scaling metric
CREATE FUNCTION custom_scaling_metric()
RETURNS float AS $$
BEGIN
  -- Custom logic: e.g., connection count weighted by query complexity
  RETURN (
    SELECT COUNT(*) * AVG(query_complexity_score)
    FROM pg_stat_activity
    WHERE state = 'active'
  );
END;
$$ LANGUAGE plpgsql;

-- Use in scaling policy
ALTER SYSTEM SET heliosdb.autoscale_custom_metric = 'custom_scaling_metric()';

Integration with Load Balancer

# haproxy.cfg
backend heliosdb_servers
    balance leastconn
    option httpchk GET /autoscale/status
    # Pre-allocate server slots that autoscaled nodes fill in via DNS
    # service discovery (requires a configured resolvers section)
    server-template node 1-10 heliosdb:5432 check resolvers dns

Monitoring

Prometheus Metrics

# prometheus.yml
scrape_configs:
  - job_name: 'heliosdb_autoscale'
    static_configs:
      - targets: ['heliosdb:9090']
    metrics_path: '/metrics/autoscale'

Key metrics:

  • heliosdb_autoscale_current_cu
  • heliosdb_autoscale_target_cpu_percent
  • heliosdb_autoscale_actual_cpu_percent
  • heliosdb_autoscale_queue_depth
  • heliosdb_autoscale_scaling_operations_total
  • heliosdb_autoscale_scaling_duration_seconds
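
These metrics can feed Prometheus alerting rules directly. A sketch, with thresholds and rule names as suggestions (the second rule assumes `heliosdb_autoscale_scaling_duration_seconds` is exported as a histogram):

```yaml
# alert_rules.yml -- example thresholds; tune for your environment
groups:
  - name: heliosdb_autoscale
    rules:
      - alert: AutoscaleOscillation
        expr: increase(heliosdb_autoscale_scaling_operations_total[10m]) > 6
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "More than 6 scaling operations in 10 minutes"
      - alert: AutoscaleSlowScaleUp
        expr: histogram_quantile(0.95, rate(heliosdb_autoscale_scaling_duration_seconds_bucket[15m])) > 10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "p95 scaling duration above the 10s target"
```

The oscillation rule mirrors the SQL diagnosis in the Troubleshooting section, but fires from the metrics pipeline without touching the database.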

Grafana Dashboard

Create dashboard with:

  1. Current CU level (gauge)
  2. CPU/Memory utilization (time series)
  3. Queue depth (time series)
  4. Scaling events (annotations)
  5. Cost trend (time series)

Conclusion

HeliosDB’s autoscaling provides intelligent resource management with significant cost savings. Configure policies based on your workload, monitor performance, and iterate for optimal results.

Key Takeaways:

  • Start conservative, optimize over time
  • Monitor scaling patterns for a week before tuning
  • Use damping to prevent oscillation
  • Set appropriate cooldown periods
  • Track costs and adjust policies

For more information:

  • Architecture: /docs/architecture/autoscaling.md
  • API Reference: /docs/api/autoscale.md
  • Cost Optimization: /docs/guides/cost-optimization.md