Smart Data Rebalancing User Guide

Overview

Automatic data rebalancing in HeliosDB redistributes data across nodes when capacity usage becomes imbalanced or when nodes are added to or removed from the cluster.

Benefits

  • Automatic load balancing
  • Zero-downtime rebalancing
  • Configurable rebalancing strategies
  • Bandwidth throttling to minimize impact

Prerequisites

  • HeliosDB v3.2+ cluster with 2+ nodes
  • Raft-based metadata service
  • Sufficient network bandwidth between nodes for data transfer

Configuration

rebalancing:
  enabled: true
  strategy: capacity_based        # Or: access_based, hybrid
  trigger_threshold_percent: 10   # Rebalance if >10% imbalance
  max_concurrent_moves: 5
  bandwidth_limit_mbps: 100
  check_interval_seconds: 3600    # Check hourly
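
As a rough illustration of how a threshold such as trigger_threshold_percent could be evaluated, the sketch below measures imbalance as the gap, in percentage points, between the most and least utilized nodes. That definition is an assumption made for illustration only; this guide does not state the formula HeliosDB uses, and NodeUsage, imbalance_percent, and should_rebalance are hypothetical names rather than part of any HeliosDB API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeUsage:
    """Hypothetical per-node record mirroring capacity_gb and used_gb."""
    node_id: str
    capacity_gb: float
    used_gb: float

    @property
    def utilization_percent(self) -> float:
        return 100.0 * self.used_gb / self.capacity_gb


def imbalance_percent(nodes: List[NodeUsage]) -> float:
    """Assumed metric: spread between the fullest and emptiest node."""
    utilizations = [n.utilization_percent for n in nodes]
    return max(utilizations) - min(utilizations)


def should_rebalance(nodes: List[NodeUsage],
                     trigger_threshold_percent: float = 10.0) -> bool:
    """True when the assumed imbalance metric exceeds the threshold."""
    return imbalance_percent(nodes) > trigger_threshold_percent


# Made-up example cluster: three equally sized nodes.
nodes = [
    NodeUsage("node-1", capacity_gb=1000, used_gb=700),
    NodeUsage("node-2", capacity_gb=1000, used_gb=550),
    NodeUsage("node-3", capacity_gb=1000, used_gb=620),
]
print(imbalance_percent(nodes))
print(should_rebalance(nodes))
```

With the made-up numbers above, the spread between node-1 and node-2 is larger than the default threshold of 10, so this sketch would report that a rebalance is due.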

SQL Examples

Manual Rebalancing

-- Trigger an immediate rebalance
SELECT heliosdb.trigger_rebalance();

-- Rebalance a specific table
SELECT heliosdb.rebalance_table('large_table');

Monitor Rebalancing

-- Check per-node capacity and balance
SELECT
    node_id,
    capacity_gb,
    used_gb,
    balance_score  -- 1.0 = perfectly balanced
FROM heliosdb.node_capacity;

-- View active rebalance operations
SELECT * FROM heliosdb.active_rebalances;

Use Cases

Node Addition

-- Add a new node
SELECT heliosdb.add_node('node-4', 'host4:5432');

-- Automatic rebalancing then moves ~25% of the data to the new node
-- (e.g. when a 3-node cluster grows to 4 nodes)
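
The ~25% figure follows from simple arithmetic if data is spread evenly over equally sized nodes: each of N nodes ends up holding 1/N of the data, so a node joining a 3-node cluster should receive about 1/4. The sketch below works this out; it is an idealized estimate under that even-distribution assumption, and the function names are hypothetical, not HeliosDB functions.

```python
def fraction_moved_to_new_nodes(existing_nodes: int, added_nodes: int = 1) -> float:
    """Share of total data that new nodes hold after an even redistribution."""
    if existing_nodes < 1 or added_nodes < 1:
        raise ValueError("node counts must be positive")
    return added_nodes / (existing_nodes + added_nodes)


def data_moved_gb(total_data_gb: float, existing_nodes: int,
                  added_nodes: int = 1) -> float:
    """Approximate volume of data that has to be transferred, in GB."""
    return total_data_gb * fraction_moved_to_new_nodes(existing_nodes, added_nodes)


# Growing from 3 to 4 nodes with a made-up 1200 GB of data in total.
print(fraction_moved_to_new_nodes(existing_nodes=3))
print(data_moved_gb(total_data_gb=1200, existing_nodes=3))
```

The same formula shows why additions to larger clusters move a smaller share: adding one node to a 9-node cluster would move roughly 10%.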

Capacity Imbalance

-- Detect imbalance
SELECT
    MAX(used_gb) - MIN(used_gb) AS imbalance_gb
FROM heliosdb.node_capacity;

-- Rebalance only if the gap between the fullest and emptiest node exceeds 100 GB
SELECT heliosdb.trigger_rebalance()
WHERE (SELECT MAX(used_gb) - MIN(used_gb)
       FROM heliosdb.node_capacity) > 100;

Troubleshooting

Slow Rebalancing

Solution: Increase the bandwidth limit and the number of concurrent moves

rebalancing:
  bandwidth_limit_mbps: 200   # Increase from 100
  max_concurrent_moves: 10    # Increase parallelism
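
To judge whether a higher limit is worth it, a back-of-the-envelope estimate of transfer time can help. The sketch below assumes bandwidth_limit_mbps is in megabits per second and caps the total sustained transfer rate; this guide does not say whether the limit applies per node or cluster-wide, so treat the result as a lower bound that ignores protocol overhead and max_concurrent_moves. The function name is hypothetical.

```python
def estimated_transfer_hours(data_gb: float, bandwidth_limit_mbps: float) -> float:
    """Lower-bound time to move data_gb at the configured limit.

    Uses decimal units: 1 GB = 1000 MB = 8000 megabits.
    """
    if bandwidth_limit_mbps <= 0:
        raise ValueError("bandwidth limit must be positive")
    megabits = data_gb * 8 * 1000
    seconds = megabits / bandwidth_limit_mbps
    return seconds / 3600


# Made-up example: 300 GB to move at the two limits shown above.
for limit in (100, 200):
    hours = estimated_transfer_hours(300, limit)
    print(f"{limit} Mbps: about {hours:.1f} hours")
```

Under these assumptions, doubling the limit halves the estimated time, provided the network and disks can actually sustain the higher rate.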

Rebalancing Impact

Solution: Schedule during off-hours

-- Enable rebalancing only during off-hours
-- The value is a cron expression: '0 2 * * *' means 2 AM daily
ALTER SYSTEM SET heliosdb.rebalance_schedule = '0 2 * * *';

Best Practices

  1. Monitor cluster balance regularly, for example by querying heliosdb.node_capacity
  2. Enable automatic rebalancing for production
  3. Set bandwidth limits low enough to leave headroom for normal traffic
  4. Schedule major rebalances during low-traffic periods

For more details, see /docs/architecture/rebalancing.md