Multi-Region Active-Active Writes — Sub-100ms Global Sync
Crates: `heliosdb-cluster/crates/active-active` (50+ tests, production), `heliosdb-cluster/crates/multiregion` (architecture-complete, example stubbed pending API)
Status: Active-active engine is production; the full multi-region setup-script example is marked "requires API methods not yet implemented" in source. See Section 10.
ARR: $60M-$100M (per audit)
UVP
Postgres replication is single-leader. Spanner is closed-source. CockroachDB and YugabyteDB are leader-per-range. HeliosDB Full ships the first PostgreSQL-compatible database with true active-active writes — write to any region, read from any region, with strong, causal, eventual, or session consistency you pick per workload. Cross-region latency lands at ~100ms strong / ~10ms eventual on a 3-region setup, with last-write-wins / first-write-wins / region-priority conflict resolution baked in. The same SQL, three regions, no leader pinning.
Prerequisites
- A working 3-node Raft cluster — see raft-setup.md.
- Three regions (anything tagged `RegionId(0..n)`, typically AWS regions, Azure zones, or GCP regions).
- HeliosDB Full v7.x+ on every node.
- About 30 minutes.
1. Two Layers, One Database
The Full edition has two cooperating multi-region subsystems, both inside heliosdb-cluster/crates:
| Layer | Crate | Job |
|---|---|---|
| Active-active write engine | active-active | per-row writes, version vectors, conflict resolution, partition tolerance |
| Multi-region cluster manager | multiregion | region topology, WAL streaming, global 2PC, query routing, health monitor |
The active-active engine is what makes “write anywhere” actually work. The multiregion cluster manager is what wires regions together. They are designed to be used together.
2. Pick a Consistency Model
Per the active-active README, four are supported:
| Model | Latency (3 regions) | Use case |
|---|---|---|
| Strong (2PC linearizable) | ~100ms | financial transactions, inventory |
| Eventual (async replication) | ~10ms | social feeds, analytics, logs |
| Causal (preserves causality) | ~50ms | comment threads, workflows |
| Session (read-your-writes) | ~40ms | user sessions, shopping carts |
You pick this per `ActiveActiveManager`, not per query; choose based on the dominant workload.
3. Set Up the Active-Active Manager
```rust
use heliosdb_active_active::*;
use chrono::Utc;

#[tokio::main]
async fn main() -> Result<()> {
    let config = ActiveActiveConfig {
        consistency_model: ConsistencyModel::Strong,
        num_regions: 3,
        cross_region_timeout_ms: 100,
        ..Default::default()
    };

    let manager = ActiveActiveManager::new(config).await?;

    let request = WriteRequest {
        key: "user:1234".to_string(),
        value: b"user_data".to_vec(),
        version: VersionVector::new(),
        region: RegionId(0),
        timestamp: Utc::now(),
        causal_deps: None,
        session_id: None,
    };

    let response = manager.write(request).await?;
    println!("Write OK: {:?}", response);

    let metrics = manager.get_metrics().await;
    println!("Max latency: {}ms", metrics.max_latency_ms);
    Ok(())
}
```

Notes:

- `VersionVector::new()` starts a fresh causality vector. The manager fills it in during commit.
- `region: RegionId(0)` means "this write originated in region 0".
- `cross_region_timeout_ms: 100` is the per-region wait. Strong consistency uses two-phase commit and waits for all participants up to this timeout.
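To see why a version vector (rather than a plain timestamp) is what lets the engine detect conflicts, here is a minimal standalone sketch of the bookkeeping: each region bumps its own counter, and two versions conflict exactly when neither dominates the other. The struct and method names below are illustrative, not the `heliosdb_active_active` API.

```rust
use std::collections::HashMap;

/// Illustrative version vector: one counter per region.
#[derive(Clone, Debug, Default)]
struct VersionVector {
    counters: HashMap<u32, u64>, // region id -> write counter
}

impl VersionVector {
    /// Record a local write originating in `region`.
    fn increment(&mut self, region: u32) {
        *self.counters.entry(region).or_insert(0) += 1;
    }

    /// True when every counter in `self` is >= the matching counter in `other`.
    fn dominates(&self, other: &VersionVector) -> bool {
        other
            .counters
            .iter()
            .all(|(r, c)| self.counters.get(r).copied().unwrap_or(0) >= *c)
    }

    /// Concurrent (conflicting) writes: neither side saw the other.
    fn conflicts_with(&self, other: &VersionVector) -> bool {
        !self.dominates(other) && !other.dominates(self)
    }

    /// Pairwise max, applied after the conflict resolver picks a winner.
    fn merge(&mut self, other: &VersionVector) {
        for (r, c) in &other.counters {
            let e = self.counters.entry(*r).or_insert(0);
            *e = (*e).max(*c);
        }
    }
}

fn main() {
    let mut a = VersionVector::default();
    let mut b = VersionVector::default();
    a.increment(0); // region 0 writes
    b.increment(1); // region 1 writes concurrently
    assert!(a.conflicts_with(&b)); // neither dominates: genuine conflict
    a.merge(&b); // after resolution, `a` covers both histories
    assert!(a.dominates(&b));
    println!("merged: {:?}", a.counters);
}
```

This is why the manager can fill the vector in for you at commit time: it already knows the per-region counters, and a fresh `VersionVector::new()` is simply the empty history.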
4. Configure Conflict Resolution
When two regions write the same key in the same instant, somebody wins. Pick how:
```rust
// Last-write-wins: uses a Hybrid Logical Clock
let config = ActiveActiveConfig {
    resolution_strategy: ResolutionStrategy::LastWriteWins,
    ..Default::default()
};

// First-write-wins
let config = ActiveActiveConfig {
    resolution_strategy: ResolutionStrategy::FirstWriteWins,
    ..Default::default()
};

// Region priority (e.g. always prefer the primary region)
let mut resolver = ConflictResolver::new(ResolutionStrategy::RegionPriority);
resolver.set_region_priority(RegionId(0), 100); // Primary
resolver.set_region_priority(RegionId(1), 50);  // Secondary
resolver.set_region_priority(RegionId(2), 25);
```

LWW is the default. Region priority is ideal when you want one region authoritative for tie-breaks (e.g. a regulatory home region) without giving up active-active for everyone else.
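The interaction between LWW and region priority can be sketched in a few lines: compare hybrid-logical-clock timestamps first, and only when they tie fall back to the configured priorities. Everything below (field names, the `resolve` helper, the flat priority slice) is an assumption for illustration, not the crate's resolver.

```rust
/// Illustrative versioned value, not the heliosdb_active_active type.
#[derive(Debug)]
struct Versioned {
    value: Vec<u8>,
    hlc: u64,    // hybrid logical clock timestamp
    region: u32, // originating region id
}

/// LWW on the HLC timestamp, with region priority as the tie-break.
/// `priority` is indexed by region id; higher wins.
fn resolve<'a>(a: &'a Versioned, b: &'a Versioned, priority: &[u32]) -> &'a Versioned {
    if a.hlc != b.hlc {
        // Normal case: the later hybrid-logical-clock timestamp wins.
        if a.hlc > b.hlc { a } else { b }
    } else {
        // Equal clocks: defer to the configured region priorities.
        let pa = priority.get(a.region as usize).copied().unwrap_or(0);
        let pb = priority.get(b.region as usize).copied().unwrap_or(0);
        if pa >= pb { a } else { b }
    }
}

fn main() {
    let us = Versioned { value: b"us".to_vec(), hlc: 100, region: 0 };
    let eu = Versioned { value: b"eu".to_vec(), hlc: 100, region: 1 };
    let priority = [100, 50, 25]; // region 0 is the regulatory home
    let winner = resolve(&us, &eu, &priority);
    assert_eq!(winner.region, 0); // HLC tie, so region 0 outranks region 1
    println!("winner: {}", String::from_utf8_lossy(&winner.value));
}
```

The point of the tie-break is determinism: every region that replays the same pair of conflicting writes must pick the same winner, or the cluster diverges.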
5. Wire Up the Region Topology (multiregion crate)
This is the cluster-level config that tells the system which regions exist. From the multiregion README:
```rust
use heliosdb_multiregion::*;

let regions = vec![
    RegionConfig {
        region_id: "us-east-1".to_string(),
        datacenter: "Virginia".to_string(),
        nodes: vec!["10.0.1.1:5432".to_string(), "10.0.1.2:5432".to_string()],
        is_primary: true,
    },
    RegionConfig {
        region_id: "eu-west-1".to_string(),
        datacenter: "Ireland".to_string(),
        nodes: vec!["10.0.2.1:5432".to_string(), "10.0.2.2:5432".to_string()],
        is_primary: false,
    },
    RegionConfig {
        region_id: "ap-south-1".to_string(),
        datacenter: "Mumbai".to_string(),
        nodes: vec!["10.0.3.1:5432".to_string(), "10.0.3.2:5432".to_string()],
        is_primary: false,
    },
];

let config = ReplicationConfig {
    mode: ReplicationMode::ActiveActive,
    conflict_resolution: ConflictStrategy::LastWriteWins,
    consistency_level: ConsistencyLevel::Quorum,
    compression: true,
    encryption: true,
    max_lag_ms: 5000,
};

let cluster = MultiRegionCluster::new_with_config(regions, config).await?;
```

- `compression: true` uses LZ4 on the WAL stream: typically 3-5x bandwidth reduction.
- `encryption: true` encrypts cross-region traffic in transit (independent of TLS at the listener).
6. Issue a Global Transaction
Multi-region 2PC across three datacenters:
```rust
let mut txn = cluster.begin_global_transaction().await?;

txn.add_operation(Operation::Write {
    key: "order:42".to_string(),
    value: b"{\"status\":\"paid\"}".to_vec(),
});

txn.add_operation(Operation::Delete {
    key: "cart:user:99".to_string(),
});

cluster.commit_global(txn).await?;
```

`commit_global` runs the prepare phase on every region. If any region votes No, all regions abort. The default transaction timeout is 30 seconds; raise it with `txn.timeout_secs = 60` if your cross-region RTT is unforgiving.
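The abort-on-any-No rule is the whole safety argument of 2PC, so it is worth spelling out as a toy decision function. This is an illustrative model of the rule described above, not the `MultiRegionCluster` implementation; a missing vote stands in for a region that timed out during prepare.

```rust
/// Illustrative 2PC vote, not a heliosdb type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Vote {
    Yes,
    No,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Commit,
    Abort,
}

/// Phase 1 collects one vote per region. A single No, or a missing
/// vote (e.g. a region that hit cross_region_timeout_ms), aborts the
/// transaction everywhere.
fn decide(votes: &[Option<Vote>]) -> Outcome {
    if votes.iter().all(|v| *v == Some(Vote::Yes)) {
        Outcome::Commit
    } else {
        Outcome::Abort
    }
}

fn main() {
    // All three regions prepared successfully: global commit.
    assert_eq!(decide(&[Some(Vote::Yes); 3]), Outcome::Commit);
    // One region voted No: global abort.
    assert_eq!(
        decide(&[Some(Vote::Yes), Some(Vote::No), Some(Vote::Yes)]),
        Outcome::Abort
    );
    // One region timed out before voting: also a global abort.
    assert_eq!(decide(&[Some(Vote::Yes), None, Some(Vote::Yes)]), Outcome::Abort);
    println!("2PC decision rule holds");
}
```

This is also why the ~100ms strong-consistency figure tracks your slowest region: commit cannot be decided until every participant has answered or timed out.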
7. Region-Aware Query Routing
Reads can be routed to the user’s nearest region:
```rust
// Pin to user's region
let target = cluster
    .route_query("SELECT * FROM users WHERE id = $1", Some("eu-west-1"))
    .await?;

// Or let the router pick the lowest-latency region
let target = cluster
    .route_query("SELECT * FROM users WHERE id = $1", None)
    .await?;
```

`route_query(_, None)` picks based on the configured router (latency / load / policy).
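A lowest-latency router reduces to "healthiest region with the smallest observed latency". Here is a self-contained sketch of that selection under stated assumptions: the `RegionHealth` struct and `pick_region` helper are hypothetical, not part of `heliosdb_multiregion`.

```rust
/// Hypothetical per-region health sample kept by the router.
struct RegionHealth {
    region_id: &'static str,
    latency_ms: u32,
    healthy: bool,
}

/// Choose the healthy region with the smallest observed latency;
/// returns None when every region is unhealthy.
fn pick_region(regions: &[RegionHealth]) -> Option<&'static str> {
    regions
        .iter()
        .filter(|r| r.healthy)
        .min_by_key(|r| r.latency_ms)
        .map(|r| r.region_id)
}

fn main() {
    let regions = [
        RegionHealth { region_id: "us-east-1", latency_ms: 80, healthy: true },
        RegionHealth { region_id: "eu-west-1", latency_ms: 12, healthy: true },
        // Lowest latency, but currently failing health checks.
        RegionHealth { region_id: "ap-south-1", latency_ms: 5, healthy: false },
    ];
    assert_eq!(pick_region(&regions), Some("eu-west-1"));
    println!("routed to {:?}", pick_region(&regions));
}
```

The health filter matters as much as the latency sort: without it, the router would happily send reads into a partitioned or lagging region just because its last latency sample looked good.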
8. Watch Replication Lag
```rust
for status in cluster.get_all_region_status().await? {
    if status.lag_ms > 5000 {
        eprintln!("⚠ High lag in {}: {}ms", status.region_id, status.lag_ms);
    }
}
```

Wire this into your alerting. `max_lag_ms: 5000` in the config above is the trigger: above it, the region is considered "lagging" but still serves reads. Above `2 * max_lag_ms` (configurable), reads can be auto-routed away.
9. Network Partitions
Active-active is designed to survive partitions:
```rust
let affected = vec![RegionId(2)];
manager.handle_partition(affected.clone()).await?;
// system continues serving from regions 0 and 1

// when the network heals:
manager.recover_from_partition(affected).await?;
// automatic reconciliation of divergent writes via the configured resolver
```

Quorum-based decision making prevents split-brain: if you can't reach a majority, you can't accept writes (unless you're in Eventual mode, where stale-but-available is the deal).
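The majority rule that prevents split-brain is worth writing down explicitly. This is a minimal sketch of the decision described above, with a hypothetical `can_accept_writes` helper; it is not the heliosdb quorum code, and the Eventual-mode carve-out is modeled as a simple flag.

```rust
/// Can this side of a partition keep accepting writes?
/// Strong/causal/session modes require a strict majority of regions;
/// Eventual mode stays available as long as any region is reachable.
fn can_accept_writes(total_regions: usize, reachable_regions: usize, eventual_mode: bool) -> bool {
    if eventual_mode {
        // Availability over consistency: serve, reconcile later.
        return reachable_regions >= 1;
    }
    // Strict majority: at most one side of any partition satisfies this.
    reachable_regions > total_regions / 2
}

fn main() {
    // 3 regions, region 2 partitioned away: 2 of 3 is still a majority.
    assert!(can_accept_writes(3, 2, false));
    // The minority side (region 2 alone) must refuse writes.
    assert!(!can_accept_writes(3, 1, false));
    // Eventual mode keeps accepting on the minority side.
    assert!(can_accept_writes(3, 1, true));
    println!("quorum rule holds");
}
```

The strict-majority check is what makes the guarantee structural: two disjoint partitions can never both hold more than half the regions, so at most one side ever accepts strong writes.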
10. Honest Status: What’s Production, What’s Architecture-Complete
Per the source-of-truth check:
- `active-active` crate: 50+ tests, production-ready per its README ("first Postgres-compatible database to offer genuine active-active capabilities").
- `multiregion` crate: library and types are production code, but the example file `examples/multiregion_setup.rs` is currently a stub: "This example requires the 'multiregion-example' feature… requires API methods and types not yet fully implemented." Production deployments today run via the `MultiRegionCluster` API directly; no end-to-end CLI driver yet.
- Per the audit, this is Tier 3 coverage (architecture exists, no full walkthrough). Treat the snippets above as a real API surface but not a copy-paste deploy script.
11. Performance Reference
From the active-active crate’s published benches:
| Scenario | Latency | Throughput |
|---|---|---|
| Strong, 1 region | 50ms | 50K ops/sec |
| Strong, 3 regions | 100ms | 30K ops/sec |
| Eventual, 3 regions | 10ms | 100K ops/sec |
| Concurrent (100 threads) | 80ms | 50K ops/sec |
Rerun on your own topology with `cargo bench` inside the crate.
Where Next
- raft-setup.md — single-region quorum.
- sharding-config.md — combine with horizontal sharding for petabyte scale.
- pitr-recovery.md — global PITR across regions.
- Source: `heliosdb-cluster/crates/active-active/README.md`, `heliosdb-cluster/crates/multiregion/README.md`