High Availability (HA)

HeliosDB-Lite provides multi-tier high availability through WAL streaming, multi-primary replication, and sharding.

HA Tiers Overview

Tier	Name	Architecture	Use Case
Tier 1	Warm Standby	Active-Passive	Basic HA, disaster recovery
Tier 2	Multi-Primary	Active-Active	Geographic distribution
Tier 3	Sharding	Distributed	Horizontal scaling

Feature Flags

Enable HA features via Cargo feature flags:

[dependencies]
heliosdb-lite = { version = "3.4", features = ["ha-tier1"] }

Feature	Description
`ha-tier1`	Warm standby replication
`ha-tier2`	Multi-primary replication
`ha-tier3`	Sharding support
`ha-dedup`	Content-addressed deduplication
`ha-branch-replication`	Branch-to-server replication

Tier 1: Warm Standby

Active-passive replication with automatic failover.

Architecture

┌─────────────┐     WAL Stream    ┌─────────────┐
│   Primary   │ ───────────────→ │   Standby   │
│   (Active)  │                   │  (Passive)  │
└─────────────┘                   └─────────────┘
       ↓                                 ↓
   Read/Write                       Read-Only

Components

Component	Description
`WalReplicator`	Streams WAL from primary
`WalApplicator`	Applies WAL on standby
`FailoverWatcher`	Monitors primary health
`LsnManager`	Tracks replication position
`SplitBrainProtector`	Prevents dual-primary scenarios

Configuration

use heliosdb_lite::replication::{ReplicationConfig, SyncMode};

let config = ReplicationConfig::builder()
    .primary_endpoint("primary.example.com:5432")
    .sync_mode(SyncMode::Synchronous)  // or Asynchronous
    .build();

Sync Modes

Mode	Description	Durability	Latency
Synchronous	Wait for standby ACK	Strong	Higher
Asynchronous	Fire-and-forget	Eventual	Lower
Quorum	Wait for N/2+1 ACKs	Configurable	Medium
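
The acknowledgement rules in the table can be sketched in a few lines of plain Rust. This is only an illustration: `AckPolicy`, `quorum_size`, and `is_durable` are hypothetical names, not part of the HeliosDB-Lite API, and the sketch assumes that synchronous mode waits for every configured standby.

```rust
/// Acknowledgement policy (hypothetical type, not the HeliosDB-Lite `SyncMode`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum AckPolicy {
    Synchronous,  // assumed here: every standby must acknowledge
    Asynchronous, // no acknowledgement required
    Quorum,       // a majority must acknowledge
}

/// Majority of `n` nodes: floor(n / 2) + 1.
fn quorum_size(n: usize) -> usize {
    n / 2 + 1
}

/// Returns true once enough standbys have acknowledged a write.
fn is_durable(policy: AckPolicy, acks: usize, standbys: usize) -> bool {
    match policy {
        AckPolicy::Synchronous => acks >= standbys,
        AckPolicy::Asynchronous => true,
        AckPolicy::Quorum => acks >= quorum_size(standbys),
    }
}
```

With three standbys, a quorum write is acknowledged after two ACKs, while a synchronous write (under the assumption above) waits for all three.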

Failover

use heliosdb_lite::replication::FailoverWatcher;

let watcher = FailoverWatcher::new(config);
watcher.on_failover(|event| {
    println!("Failover triggered: {:?}", event);
    // Promote standby to primary
});
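
How the watcher decides that the primary is down is not described here. A common approach is a heartbeat timeout, sketched below with the standard library only. `HeartbeatMonitor` and its methods are hypothetical and are not the `FailoverWatcher` implementation.

```rust
use std::time::{Duration, Instant};

/// Tracks the last heartbeat seen from the primary (hypothetical helper).
struct HeartbeatMonitor {
    last_seen: Instant,
    timeout: Duration,
}

impl HeartbeatMonitor {
    fn new(now: Instant, timeout: Duration) -> Self {
        Self { last_seen: now, timeout }
    }

    /// Call whenever a heartbeat arrives from the primary.
    fn record_heartbeat(&mut self, now: Instant) {
        self.last_seen = now;
    }

    /// The primary is considered down once no heartbeat has arrived
    /// for longer than `timeout`.
    fn primary_is_down(&self, now: Instant) -> bool {
        now.duration_since(self.last_seen) > self.timeout
    }
}
```

Passing `now` explicitly keeps the logic easy to test; production code would typically pair a check like this with split-brain protection before promoting a standby.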

Split-Brain Protection

use heliosdb_lite::replication::{SplitBrainProtector, ObserverConfig};

let protector = SplitBrainProtector::new(ObserverConfig {
    observers: vec!["observer1.example.com", "observer2.example.com"],
    quorum_size: 2,
});

protector.start();
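
As a rough illustration of the idea behind observer quorums and fencing tokens, the sketch below uses hypothetical helpers (`has_observer_quorum`, `FencingGuard`) and assumes the common convention that fencing tokens only ever increase. It does not reflect the internals of `SplitBrainProtector`.

```rust
/// True when at least `quorum_size` observers confirm this node as primary.
fn has_observer_quorum(confirmations: usize, quorum_size: usize) -> bool {
    confirmations >= quorum_size
}

/// Remembers the highest fencing token seen so far (hypothetical helper).
struct FencingGuard {
    highest_seen: u64,
}

impl FencingGuard {
    fn new() -> Self {
        Self { highest_seen: 0 }
    }

    /// Accepts a write only if its token is not older than any token seen
    /// before, so a deposed primary holding a stale token is rejected.
    fn accept(&mut self, token: u64) -> bool {
        if token < self.highest_seen {
            return false;
        }
        self.highest_seen = token;
        true
    }
}
```

With two observers and `quorum_size: 2`, as in the configuration above, a node that can reach only one observer would not be allowed to act as primary under this scheme.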

Tier 2: Multi-Primary

Active-active replication with conflict resolution.

Architecture

┌─────────────┐   Branch Sync   ┌─────────────┐
│  Region A   │ ←─────────────→ │  Region B   │
│  (Primary)  │                 │  (Primary)  │
└─────────────┘                 └─────────────┘
    ↓     ↓                       ↓     ↓
  Writes Reads                  Writes Reads

Components

Component	Description
`MultiPrimarySyncManager`	Coordinates multi-region sync
`ConflictMergeEngine`	Resolves write conflicts
`RegionCoordinator`	Manages region topology

Conflict Resolution Strategies

Strategy	Description	Use Case
Last-Write-Wins	Timestamp-based	Simple setups; conflicts are resolved silently
Branch-Wins	Prefer local changes	Low-latency local writes
Merge	Combine changes	Collaborative editing
Custom	User-defined logic	Complex business rules
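
To make the Last-Write-Wins strategy concrete, here is a minimal standalone sketch. `Versioned` and `last_write_wins` are hypothetical names, and breaking timestamp ties by region name is an assumption chosen so that every region picks the same winner; it is not necessarily what `ConflictMergeEngine` does.

```rust
/// One version of a row, as written in a given region (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Versioned {
    value: String,
    timestamp_ms: u64,
    region: &'static str,
}

/// Keeps the version with the larger (timestamp, region) pair.
/// Including the region makes the result independent of argument order.
fn last_write_wins(a: Versioned, b: Versioned) -> Versioned {
    if (b.timestamp_ms, b.region) > (a.timestamp_ms, a.region) {
        b
    } else {
        a
    }
}
```

Because the comparison is deterministic, both regions converge on the same value without coordination, at the cost of silently discarding the older write.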

Configuration

use heliosdb_lite::replication::{
    MultiPrimarySyncManager,
    ConflictResolution,
};

let sync = MultiPrimarySyncManager::new()
    .add_region("us-east", "us-east.example.com:5432")
    .add_region("eu-west", "eu-west.example.com:5432")
    .conflict_resolution(ConflictResolution::LastWriteWins)
    .build();

Branch-Based Replication

Multi-primary uses HeliosDB-Lite’s branching for conflict-free merges:

-- Each region maintains its own branch
-- Sync merges branches across regions

-- Region A writes
INSERT INTO orders (id, total) VALUES (1, 100);

-- Region B writes (concurrent)
INSERT INTO orders (id, total) VALUES (2, 200);

-- After sync: both rows present in all regions

Tier 3: Sharding

Horizontal scaling with consistent hashing.

Architecture

                    ┌─────────────┐
                    │   Router    │
                    └──────┬──────┘
           ┌───────────────┼───────────────┐
           ↓               ↓               ↓
    ┌──────────┐    ┌──────────┐    ┌──────────┐
    │ Shard 1  │    │ Shard 2  │    │ Shard 3  │
    │  (0-33%) │    │ (34-66%) │    │ (67-100%)│
    └──────────┘    └──────────┘    └──────────┘

Components

Component	Description
`HashRing`	Consistent hashing for key distribution
`ShardRouter`	Routes queries to correct shard
`ReshardManager`	Online resharding with minimal downtime
`VectorPartitioner`	Special partitioning for vector data

Sharding Strategies

Strategy	Description	Best For
Hash	Consistent hash of shard key	Even distribution
Range	Key ranges per shard	Time-series data
Geographic	Location-based routing	Multi-region
Vector	Centroid-based partitioning	Vector search

Configuration

use heliosdb_lite::replication::{HashRing, ShardRouter};

let ring = HashRing::new()
    .add_node("shard1.example.com:5432", 100)  // weight: 100
    .add_node("shard2.example.com:5432", 100)
    .add_node("shard3.example.com:5432", 100)
    .build();

let router = ShardRouter::new(ring)
    .shard_key("tenant_id")  // Shard by tenant
    .build();
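
For readers unfamiliar with consistent hashing, the following standalone sketch shows the general technique using only the standard library. `SketchRing` is a hypothetical type, and treating a node's weight as its number of virtual points on the ring is an assumption; the real `HashRing` may work differently.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hashes any value to a position on the ring.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Minimal consistent hash ring (hypothetical, for illustration only).
struct SketchRing {
    points: BTreeMap<u64, String>, // ring position -> owning node
}

impl SketchRing {
    fn new() -> Self {
        Self { points: BTreeMap::new() }
    }

    /// Places `weight` virtual points for `node` on the ring.
    fn add_node(&mut self, node: &str, weight: u32) {
        for replica in 0..weight {
            self.points.insert(hash_of(&(node, replica)), node.to_string());
        }
    }

    /// Removes every virtual point owned by `node`.
    fn remove_node(&mut self, node: &str) {
        self.points.retain(|_, owner| owner.as_str() != node);
    }

    /// Finds the first point at or after the key's hash, wrapping around.
    fn node_for(&self, key: &str) -> Option<&str> {
        let h = hash_of(&key);
        self.points
            .range(h..)
            .next()
            .or_else(|| self.points.iter().next())
            .map(|(_, node)| node.as_str())
    }
}
```

The property that matters for sharding is that removing a node only remaps the keys that node owned; keys on the other shards stay where they are.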

Vector Partitioning

Special support for vector workloads:

use heliosdb_lite::replication::VectorPartitioner;

let partitioner = VectorPartitioner::new()
    .dimensions(768)
    .num_centroids(16)  // 16 partitions based on vector similarity
    .build();

// Vectors routed to shard containing nearest centroid
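
Nearest-centroid routing can be sketched as follows. The function names are hypothetical, and squared Euclidean distance is an assumption; the distance metric used by `VectorPartitioner` is not stated here.

```rust
/// Squared Euclidean distance between two vectors of equal length.
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns the index of the centroid closest to `vector`,
/// or `None` when there are no centroids.
fn nearest_centroid(vector: &[f32], centroids: &[Vec<f32>]) -> Option<usize> {
    centroids
        .iter()
        .enumerate()
        .map(|(index, centroid)| (index, squared_distance(vector, centroid)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(index, _)| index)
}
```

The returned index would then identify the partition (and therefore the shard) that stores the vector, so similar vectors tend to land on the same shard.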

Resharding

Online resharding with minimal downtime:

use heliosdb_lite::replication::ReshardManager;

let reshard = ReshardManager::new(ring)
    .target_shards(6)  // Scale from 3 to 6 shards
    .parallel_streams(4)
    .build();

reshard.execute().await?;  // Non-blocking migration

Logical Replication

For selective table replication:

use heliosdb_lite::replication::{
    LogicalReplicationPipeline,
    TableFilter,
    ColumnMapping,
};

let pipeline = LogicalReplicationPipeline::new()
    .source("source.example.com:5432")
    .destination("dest.example.com:5432")
    .table_filter(TableFilter::include(&["users", "orders"]))
    .column_mapping(ColumnMapping::new()
        .rename("old_name", "new_name")
        .exclude("sensitive_column"))
    .build();

pipeline.start().await?;

CLI Options

Start HeliosDB-Lite in HA mode:

# Primary mode
heliosdb-lite server --ha-mode primary --ha-bind 0.0.0.0:5433

# Standby mode
heliosdb-lite server --ha-mode standby --ha-primary primary.example.com:5433

# Multi-primary mode
heliosdb-lite server --ha-mode multi-primary \
    --ha-region us-east \
    --ha-peers eu-west.example.com:5433

Docker Support

Docker Compose for HA cluster:

version: '3.8'
services:
  primary:
    image: heliosdb/heliosdb-lite:latest
    command: server --ha-mode primary
    ports:
      - "5432:5432"
      - "5433:5433"
    environment:
      - HA_SYNC_MODE=synchronous

  standby:
    image: heliosdb/heliosdb-lite:latest
    command: server --ha-mode standby --ha-primary primary:5433
    depends_on:
      - primary

Transparent Write Routing (TWR)

HeliosDB-Lite implements Transparent Write Routing (TWR): applications can connect to any node (primary or standby), and writes are automatically routed to the primary.

How It Works

Application → Standby → (DML/DDL forwarded) → Primary
                ↓
         (SELECT executed locally)

Behavior by Sync Mode

Sync Mode	DQL (SELECT)	DML (INSERT/UPDATE/DELETE)
sync	Execute locally on standby	Forward to primary, return result
semi-sync	Execute locally on standby	Forward to primary, return result
async	Execute locally on standby	Reject (traditional read-only)

Operations Subject to Routing

When connected to a standby in sync/semi-sync mode:

Operation	Behavior
`SELECT`	Execute locally (DQL)
`INSERT`	Forward to primary (DML)
`UPDATE`	Forward to primary (DML)
`DELETE`	Forward to primary (DML)
`CREATE`	Forward to primary (DDL)
`DROP`	Forward to primary (DDL)
`ALTER`	Forward to primary (DDL)
`TRUNCATE`	Forward to primary (DDL)
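
The routing rules in the two tables above can be expressed as a small decision function. This is a sketch only: `Route`, `StandbySyncMode`, and `route_on_standby` are hypothetical names, classifying by the first keyword is a simplification, and rejecting DDL in async mode and rejecting unrecognized statements are both assumptions that the tables do not spell out.

```rust
#[derive(Debug, PartialEq)]
enum Route {
    Local,
    ForwardToPrimary,
    Reject,
}

#[derive(Debug, Clone, Copy)]
enum StandbySyncMode {
    Sync,
    SemiSync,
    Async,
}

/// Decides what a standby does with a statement, based on its first keyword.
fn route_on_standby(sql: &str, mode: StandbySyncMode) -> Route {
    let keyword = sql
        .split_whitespace()
        .next()
        .unwrap_or("")
        .to_ascii_uppercase();

    match keyword.as_str() {
        // Reads always run on the node the client is connected to.
        "SELECT" => Route::Local,
        // Writes and schema changes go to the primary, unless the standby
        // is in async mode and behaves as a traditional read-only replica.
        "INSERT" | "UPDATE" | "DELETE" | "CREATE" | "DROP" | "ALTER" | "TRUNCATE" => match mode {
            StandbySyncMode::Sync | StandbySyncMode::SemiSync => Route::ForwardToPrimary,
            StandbySyncMode::Async => Route::Reject,
        },
        // Assumption: anything unrecognized is rejected.
        _ => Route::Reject,
    }
}
```

A real router would need a proper SQL parser, since statements such as `WITH ... INSERT` or leading comments defeat first-keyword matching.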

Example: Transparent Routing

-- Connect to STANDBY and execute INSERT (forwarded to primary)
INSERT INTO users VALUES (3, 'Charlie');
-- Result: INSERT 0 1 (success - executed on primary)

-- SELECT always executes locally on the connected standby
SELECT * FROM users;

Benefits

Load Distribution: Applications can connect to any node; reads distributed, writes auto-routed
Simplified Application Logic: No need for separate read/write connection strings
High Availability: Application continues working if it connects to standby
Transparent Failover: Combined with connection pooling, provides seamless failover

Monitoring

HA System Views

HeliosDB-Lite provides SQL system views for monitoring HA configuration and replication metrics.

pg_replication_status

View node configuration and role:

SELECT * FROM pg_replication_status;

Column	Description
`node_id`	Unique identifier for this node
`role`	primary, standby, observer, or standalone
`sync_mode`	async, semi-sync, or sync
`listen_address`	Host and port
`replication_port`	WAL streaming port
`current_lsn`	Current log sequence number
`is_read_only`	true/false
`standby_count`	Number of connected standbys (primary only)
`uptime_seconds`	Time since node started

pg_replication_standbys (Primary Only)

View connected standbys:

SELECT * FROM pg_replication_standbys;

Column	Description
`node_id`	Standby’s unique identifier
`address`	Standby’s connection address
`sync_mode`	Replication mode for this standby
`state`	connecting, streaming, catching_up, synced, disconnected
`current_lsn`	Standby’s current LSN position
`flush_lsn`	Flushed LSN
`apply_lsn`	Applied LSN
`lag_bytes`	Replication lag in bytes
`lag_ms`	Replication lag in milliseconds
`connected_at`	Connection timestamp
`last_heartbeat`	Last heartbeat received

pg_replication_primary (Standby Only)

View primary connection status:

SELECT * FROM pg_replication_primary;

Column	Description
`node_id`	Primary’s unique identifier
`address`	Primary’s address
`state`	disconnected, connecting, connected, streaming, error
`primary_lsn`	Primary’s current LSN
`local_lsn`	Local LSN position
`lag_bytes`	Replication lag in bytes
`lag_ms`	Replication lag in milliseconds
`fencing_token`	Split-brain protection token
`connected_at`	Connection timestamp
`last_heartbeat`	Last heartbeat received

pg_replication_metrics

View performance metrics:

SELECT * FROM pg_replication_metrics;

Column	Description
`wal_writes`	Total WAL write operations
`wal_bytes_written`	Total WAL bytes written
`records_replicated`	Records sent to standbys
`bytes_replicated`	Bytes sent to standbys
`heartbeats_sent`	Number of heartbeats sent
`heartbeats_received`	Number of heartbeats received
`reconnect_count`	Number of reconnections
`last_wal_write`	Timestamp of last WAL write
`last_replication`	Timestamp of last replication

Monitoring Examples

-- Check if standbys are in sync
SELECT
    node_id,
    CASE
        WHEN lag_ms < 1000 THEN 'IN_SYNC'
        WHEN lag_ms < 60000 THEN 'CATCHING_UP'
        ELSE 'LAGGING'
    END as status,
    lag_ms
FROM pg_replication_standbys;

-- View this node's role and current LSN
SELECT node_id, role, current_lsn
FROM pg_replication_status;
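
An application that polls `lag_ms` from `pg_replication_standbys` could apply the same thresholds client-side before raising an alert. The sketch below mirrors the `CASE` expression in the first query; `LagStatus`, `classify_lag`, and `should_alert` are hypothetical names, and the thresholds are the illustrative ones from the query, not recommended values.

```rust
#[derive(Debug, PartialEq)]
enum LagStatus {
    InSync,
    CatchingUp,
    Lagging,
}

/// Mirrors the CASE expression: < 1 s is in sync, < 60 s is catching up.
fn classify_lag(lag_ms: u64) -> LagStatus {
    if lag_ms < 1_000 {
        LagStatus::InSync
    } else if lag_ms < 60_000 {
        LagStatus::CatchingUp
    } else {
        LagStatus::Lagging
    }
}

/// Alert only when a standby has fallen well behind.
fn should_alert(lag_ms: u64) -> bool {
    classify_lag(lag_ms) == LagStatus::Lagging
}
```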

Best Practices

Network: Use a dedicated replication network
Monitoring: Alert when replication lag exceeds a defined threshold
Testing: Regularly test failover procedures
Backups: Continue point-in-time backups even with HA
Quorum: Use an odd number of nodes for consensus
