Dynamic Autoscaling User Guide
Overview
HeliosDB’s autoscaling framework automatically adjusts compute units (CUs) from zero up to a configured maximum based on workload demand, delivering sub-10-second scale-up and cost savings through intelligent resource management.
Benefits
- Dynamic 0-to-max CU scaling based on real-time metrics
- Sub-10s scale-up, sub-60s scale-down
- 28%+ cost savings through idle time optimization
- No oscillation with damping mechanism
- Vertical and horizontal scaling support
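The damping mechanism mentioned above can be pictured as a weighted average between the current CU level and the raw target: a higher damping factor keeps more weight on the current level, so a noisy target cannot whipsaw the node. This is an illustrative sketch under that assumption, not HeliosDB's actual implementation; the function name and formula are invented for the example:

```python
def damped_target(current_cu, raw_target_cu, damping_factor=0.8):
    """Blend the current CU level with the raw target.

    Higher damping_factor keeps more weight on the current level,
    so the CU level moves more slowly toward a noisy target.
    """
    return damping_factor * current_cu + (1 - damping_factor) * raw_target_cu

# A target oscillating between 2 and 8 CU barely moves the damped level:
cu = 4.0
for raw in [8.0, 2.0, 8.0, 2.0]:
    cu = damped_target(cu, raw)
    print(round(cu, 2))
```

With `damping_factor: 0.8` the level stays close to its starting point instead of bouncing between the extremes, which is why the troubleshooting section below suggests raising the factor when oscillation appears.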
Key Features
- CPU, memory, and queue depth monitoring
- Configurable scaling policies
- Automatic scale-to-zero for idle workloads
- Real-time metrics and Prometheus integration
- Per-second billing accuracy
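Scale-to-zero hinges on idle detection: once no activity has been observed for the configured timeout, the node becomes a candidate for scaling to zero. A minimal sketch of that check, with an invented function whose parameter mirrors the `scale_to_zero_idle_seconds` setting shown later (the actual detection logic in HeliosDB may differ):

```python
import time

def should_scale_to_zero(last_activity_ts, idle_seconds=300, now=None):
    """Return True when the node has been idle for at least idle_seconds."""
    if now is None:
        now = time.time()
    return (now - last_activity_ts) >= idle_seconds

# A node idle for 6 minutes qualifies under the default 5-minute timeout:
print(should_scale_to_zero(last_activity_ts=1000.0, now=1360.0))
```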
Prerequisites
- HeliosDB v3.2+ with autoscale module
- 4GB+ RAM for resource manager
- Monitoring tools (Prometheus recommended)
Step-by-Step Configuration
1. Enable Autoscaling
/etc/heliosdb/autoscaling.yaml:
```yaml
autoscaling:
  enabled: true
  min_cu: 0.0       # Scale to zero
  max_cu: 16.0      # Maximum per node
  target_cpu: 70    # Target CPU utilization
  target_queue: 10  # Target queue depth

  # Thresholds
  scale_up_threshold: 80    # Scale up at 80% CPU
  scale_down_threshold: 30  # Scale down at 30% CPU

  # Timing
  cooldown_period: 60       # Seconds between scaling actions
  scale_step_percent: 50.0  # Scale by 50% each step

  # Idle handling
  scale_to_zero_idle_seconds: 300  # 5 min idle timeout

  # Damping (prevents oscillation)
  damping_factor: 0.8
```

2. Configure Monitoring
```yaml
activity_monitor:
  idle_timeout_seconds: 300
  activity_threshold_seconds: 1
  activity_window_size: 1000
  sample_interval_seconds: 10
  track_readonly_queries: true
  track_system_queries: false
```

3. Resource Limits
```yaml
resources:
  total_cpu_cores: 64
  total_memory_mb: 262144  # 256 GB
  min_cpu_cores: 1
  min_memory_mb: 2048      # 2 GB
  max_cpu_cores: 16
  max_memory_mb: 65536     # 64 GB
```

SQL Examples
Monitoring Autoscaling
```sql
-- Check current autoscaling status
SELECT
  node_id,
  current_cu,
  target_cpu_percent,
  actual_cpu_percent,
  queue_depth,
  last_scaling_action,
  last_scaling_time
FROM heliosdb.autoscale_status;
```

Viewing Scaling History
```sql
-- Recent scaling events
SELECT
  timestamp,
  event_type,  -- ScaleUp, ScaleDown, NoChange
  from_cu,
  to_cu,
  reason,
  confidence,
  duration_ms
FROM heliosdb.autoscaling_events
WHERE timestamp > now() - interval '24 hours'
ORDER BY timestamp DESC;
```

Manual Scaling
```sql
-- Temporarily disable autoscaling
SELECT heliosdb.disable_autoscaling('node-1');

-- Manually set CU level
SELECT heliosdb.set_cu('node-1', 4.0);

-- Re-enable autoscaling
SELECT heliosdb.enable_autoscaling('node-1');
```

Performance Metrics
```sql
-- Autoscaling effectiveness
SELECT
  DATE_TRUNC('hour', timestamp) AS hour,
  AVG(cpu_utilization) AS avg_cpu,
  AVG(current_cu) AS avg_cu,
  COUNT(CASE WHEN event_type = 'ScaleUp' THEN 1 END) AS scale_ups,
  COUNT(CASE WHEN event_type = 'ScaleDown' THEN 1 END) AS scale_downs
FROM heliosdb.autoscale_metrics
WHERE timestamp > now() - interval '24 hours'
GROUP BY hour
ORDER BY hour;
```

Common Use Cases
Use Case 1: E-commerce Website
Traffic Pattern: High during business hours, low overnight
```yaml
autoscaling:
  min_cu: 0.5  # Minimum 0.5 CU (never zero for web)
  max_cu: 8.0
  target_cpu: 65
  scale_up_threshold: 75
  scale_down_threshold: 35
  cooldown_period: 120              # 2 min (more conservative)
  scale_to_zero_idle_seconds: 3600  # 1 hour
```

Expected Behavior:
- 8 AM - 5 PM: 2-6 CU (busy hours)
- 5 PM - 11 PM: 1-3 CU (evening traffic)
- 11 PM - 8 AM: 0.5 CU (maintenance only)
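The CU-hour arithmetic behind this pattern can be sketched as follows. The per-window CU averages and the $0.10/CU-hour rate are illustrative assumptions for the example, not measured values:

```python
# Rough daily cost estimate for the e-commerce pattern above,
# compared with an always-on deployment pinned at max_cu.
windows = [
    (9, 4.0),  # 8 AM - 5 PM: assume ~4 CU on average
    (6, 2.0),  # 5 PM - 11 PM: assume ~2 CU on average
    (9, 0.5),  # 11 PM - 8 AM: the 0.5 CU floor
]
cu_hours = sum(hours * cu for hours, cu in windows)
fixed_cu_hours = 24 * 8.0  # always-on at max_cu: 8.0

rate_usd = 0.10  # assumed $/CU-hour
print(cu_hours, fixed_cu_hours)                        # 52.5 192.0
print(f"daily cost: ${cu_hours * rate_usd:.2f} vs ${fixed_cu_hours * rate_usd:.2f}")
print(f"saving: {1 - cu_hours / fixed_cu_hours:.0%}")  # saving: 73%
```

Real savings depend on the actual load curve; the point is that the idle and overnight windows dominate the difference.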
Use Case 2: Batch Processing
Pattern: Periodic batch jobs every 4 hours
```yaml
autoscaling:
  min_cu: 0.0   # Scale to zero between batches
  max_cu: 16.0  # Full power during batch
  scale_up_threshold: 70    # Quick scale-up
  scale_down_threshold: 20  # Aggressive scale-down
  cooldown_period: 30       # Fast response
  scale_to_zero_idle_seconds: 180  # 3 min
```

Use Case 3: Development Environment
Pattern: Used only during work hours
```yaml
autoscaling:
  min_cu: 0.0
  max_cu: 4.0  # Lower max for dev
  scale_to_zero_idle_seconds: 600  # 10 min
  scale_step_percent: 100.0        # Double/halve each time
```

Troubleshooting
Issue: Frequent Scaling (Oscillation)
Symptom: Node scales up/down repeatedly
Diagnosis:
```sql
SELECT
  DATE_TRUNC('minute', timestamp) AS minute,
  COUNT(*) AS scaling_events
FROM heliosdb.autoscaling_events
WHERE timestamp > now() - interval '1 hour'
GROUP BY minute
HAVING COUNT(*) > 2;  -- More than 2 events/minute is problematic
```

Solution 1: Increase Cooldown
```yaml
autoscaling:
  cooldown_period: 180  # Increase from 60
```

Solution 2: Increase Damping
```yaml
autoscaling:
  damping_factor: 0.9  # Increase from 0.8
```

Solution 3: Widen Threshold Gap
```yaml
autoscaling:
  scale_up_threshold: 85    # Increase from 80
  scale_down_threshold: 25  # Decrease from 30
  # Wider gap prevents oscillation
```

Issue: Slow Scale-Up
Symptom: Users experience latency during traffic spikes
Solution: More Aggressive Scale-Up
```yaml
autoscaling:
  scale_up_threshold: 70     # Lower threshold
  scale_step_percent: 100.0  # Double instead of 50% increase
  cooldown_period: 30        # Faster response
```

Issue: High Costs
Symptom: Autoscaling not reducing costs as expected
Diagnosis:
```sql
-- Analyze CU utilization
SELECT
  DATE(timestamp) AS date,
  AVG(current_cu) AS avg_cu,
  MIN(current_cu) AS min_cu,
  MAX(current_cu) AS max_cu,
  AVG(cpu_utilization) AS avg_cpu
FROM heliosdb.autoscale_metrics
WHERE timestamp > now() - interval '7 days'
GROUP BY date;
```

Solution: More Aggressive Scale-Down
```yaml
autoscaling:
  scale_down_threshold: 40         # Increase from 30
  scale_to_zero_idle_seconds: 180  # Reduce from 300
```

Performance Tuning
1. Optimize for Workload Type
OLTP (Transactional):
```yaml
autoscaling:
  target_cpu: 60   # Lower target for better responsiveness
  target_queue: 5  # Low queue tolerance
  scale_up_threshold: 70
```

OLAP (Analytical):
```yaml
autoscaling:
  target_cpu: 80            # Higher utilization OK
  target_queue: 20          # Can tolerate queuing
  scale_down_threshold: 40  # Keep resources longer
```

Mixed Workload:
```yaml
autoscaling:
  target_cpu: 70
  target_queue: 10
  scale_up_threshold: 80
  scale_down_threshold: 30
```

2. Time-Based Policies
```sql
-- Create scheduled scaling policy
-- (RETURNS void, since the function is invoked via SELECT from cron,
-- not as a trigger)
CREATE FUNCTION business_hours_policy()
RETURNS void AS $$
BEGIN
  IF EXTRACT(HOUR FROM now()) BETWEEN 8 AND 17 THEN
    -- Business hours: maintain minimum 2 CU
    ALTER SYSTEM SET heliosdb.autoscale_min_cu = 2.0;
  ELSE
    -- Off-hours: allow scale to zero
    ALTER SYSTEM SET heliosdb.autoscale_min_cu = 0.0;
  END IF;
END;
$$ LANGUAGE plpgsql;

-- Schedule hourly check
SELECT cron.schedule('autoscale-policy', '0 * * * *', 'SELECT business_hours_policy()');
```

3. Predictive Scaling
```sql
-- Analyze historical patterns
CREATE MATERIALIZED VIEW scaling_patterns AS
SELECT
  EXTRACT(DOW FROM timestamp) AS day_of_week,
  EXTRACT(HOUR FROM timestamp) AS hour_of_day,
  AVG(current_cu) AS typical_cu,
  AVG(cpu_utilization) AS typical_cpu
FROM heliosdb.autoscale_metrics
WHERE timestamp > now() - interval '30 days'
GROUP BY day_of_week, hour_of_day;

-- Pre-scale based on pattern
SELECT heliosdb.set_target_cu(
  (SELECT typical_cu
   FROM scaling_patterns
   WHERE day_of_week = EXTRACT(DOW FROM now())
     AND hour_of_day = EXTRACT(HOUR FROM now() + interval '10 minutes'))
);
```

Best Practices
1. Start Conservative
```yaml
# Initial configuration
autoscaling:
  min_cu: 1.0  # Don't start with 0
  max_cu: 8.0  # Don't start with max
  cooldown_period: 120             # Longer cooldown
  scale_to_zero_idle_seconds: 900  # 15 min
```

After monitoring for a week, gradually optimize:
```yaml
# Optimized after observation
autoscaling:
  min_cu: 0.0   # Now confident to scale to zero
  max_cu: 16.0  # Increase max after testing
  cooldown_period: 60
  scale_to_zero_idle_seconds: 300
```

2. Monitor Before Optimizing
```sql
-- Weekly performance review
SELECT
  'Scale-ups' AS metric,
  COUNT(*) AS count,
  AVG(duration_ms) AS avg_duration_ms
FROM heliosdb.autoscaling_events
WHERE event_type = 'ScaleUp'
  AND timestamp > now() - interval '7 days'
UNION ALL
SELECT
  'Scale-downs',
  COUNT(*),
  AVG(duration_ms)
FROM heliosdb.autoscaling_events
WHERE event_type = 'ScaleDown'
  AND timestamp > now() - interval '7 days';
```

3. Alert on Anomalies
```sql
-- Create alert for excessive scaling
CREATE FUNCTION check_scaling_frequency()
RETURNS void AS $$
DECLARE
  events_per_hour integer;
BEGIN
  SELECT COUNT(*) INTO events_per_hour
  FROM heliosdb.autoscaling_events
  WHERE timestamp > now() - interval '1 hour';

  IF events_per_hour > 10 THEN
    RAISE WARNING 'Excessive scaling: % events in last hour', events_per_hour;
  END IF;
END;
$$ LANGUAGE plpgsql;

SELECT cron.schedule('check-scaling', '*/15 * * * *', 'SELECT check_scaling_frequency()');
```

4. Cost Tracking
```sql
-- Daily cost analysis
-- (the window function is computed in a subquery, since window
-- functions cannot appear inside aggregates)
CREATE VIEW daily_autoscale_costs AS
SELECT
  DATE(timestamp) AS date,
  AVG(current_cu) AS avg_cu,
  SUM(current_cu * interval_seconds / 3600.0) AS cu_hours,
  SUM(current_cu * interval_seconds / 3600.0) * 0.10 AS cost_usd  -- $0.10/CU-hour
FROM (
  SELECT
    timestamp,
    current_cu,
    EXTRACT(EPOCH FROM
      LEAD(timestamp, 1, now()) OVER (ORDER BY timestamp) - timestamp
    ) AS interval_seconds
  FROM heliosdb.autoscale_metrics
) samples
GROUP BY date
ORDER BY date DESC;
```

Advanced Topics
Horizontal Scaling
Enable multi-node scaling:
```yaml
autoscaling:
  horizontal:
    enabled: true
    min_nodes: 1
    max_nodes: 10
    node_max_cu: 16.0
    scale_out_cpu_threshold: 85.0
    scale_in_cpu_threshold: 20.0
    drain_timeout_seconds: 120
```

Custom Metrics
```sql
-- Register custom scaling metric
CREATE FUNCTION custom_scaling_metric()
RETURNS float AS $$
BEGIN
  -- Custom logic: e.g., connection count weighted by query complexity
  RETURN (
    SELECT COUNT(*) * AVG(query_complexity_score)
    FROM pg_stat_activity
    WHERE state = 'active'
  );
END;
$$ LANGUAGE plpgsql;

-- Use in scaling policy
ALTER SYSTEM SET heliosdb.autoscale_custom_metric = 'custom_scaling_metric()';
```

Integration with Load Balancer
```
# haproxy.cfg
backend heliosdb_servers
    balance leastconn
    option httpchk GET /autoscale/status

    # Dynamic server addition based on autoscaling
    server-template node 1-10 heliosdb-{1..10}:5432 check
```

Monitoring
Prometheus Metrics
```yaml
scrape_configs:
  - job_name: 'heliosdb_autoscale'
    static_configs:
      - targets: ['heliosdb:9090']
    metrics_path: '/metrics/autoscale'
```

Key metrics:
- heliosdb_autoscale_current_cu
- heliosdb_autoscale_target_cpu_percent
- heliosdb_autoscale_actual_cpu_percent
- heliosdb_autoscale_queue_depth
- heliosdb_autoscale_scaling_operations_total
- heliosdb_autoscale_scaling_duration_seconds
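These metrics are served in the standard Prometheus text exposition format, so a scrape can also be read directly with a few lines of code. The sample payload below is invented for illustration; only the metric names come from the list above:

```python
# Parse a gauge value out of a Prometheus text-format scrape.
sample_scrape = """\
# HELP heliosdb_autoscale_current_cu Current compute units
# TYPE heliosdb_autoscale_current_cu gauge
heliosdb_autoscale_current_cu 4.0
heliosdb_autoscale_queue_depth 7
"""

def metric_value(scrape: str, name: str) -> float:
    """Return the value of the first sample whose name matches exactly."""
    for line in scrape.splitlines():
        if line.startswith(name + " "):
            return float(line.split()[1])
    raise KeyError(name)

print(metric_value(sample_scrape, "heliosdb_autoscale_current_cu"))  # 4.0
```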
Grafana Dashboard
Create dashboard with:
- Current CU level (gauge)
- CPU/Memory utilization (time series)
- Queue depth (time series)
- Scaling events (annotations)
- Cost trend (time series)
Conclusion
HeliosDB’s autoscaling provides intelligent resource management with significant cost savings. Configure policies based on your workload, monitor performance, and iterate for optimal results.
Key Takeaways:
- Start conservative, optimize over time
- Monitor scaling patterns for a week before tuning
- Use damping to prevent oscillation
- Set appropriate cooldown periods
- Track costs and adjust policies
For more information:
- Architecture: /docs/architecture/autoscaling.md
- API Reference: /docs/api/autoscale.md
- Cost Optimization: /docs/guides/cost-optimization.md