Point-in-Time Recovery — Restore to Any Second in the Retention Window

Crate: heliosdb-ha/crates/pitr (1,650 LOC)
Modules: wal, snapshot, recovery, timeline, coordinator, storage, compression, verification
Status: Production


UVP

When the dropped table happened at 14:32:07 and the backup ran at 02:00, you don’t want to lose 12 hours of data — you want to recover to 14:32:06. The Full edition’s PITR coordinator combines periodic snapshots with continuous WAL archival so you can restore to any LSN, any timestamp, any transaction ID, or any named recovery point inside the retention window. Configurable RPO down to 1 minute, RTO down to 5 minutes, parallel recovery workers, checksum verification, optional compression. No external backup service. No vendor lock-in. Local, S3, Azure, GCS — pick your archive directory.


Prerequisites

  • A running HeliosDB Full instance.
  • Disk space for the WAL + snapshot + archive directories.
  • Optional: object-storage destination if you don’t trust local disk.
  • About 20 minutes.

1. The Configuration

From pitr/src/lib.rs:

use heliosdb_pitr::{PITRConfig, RPO, RTO};
use std::path::PathBuf;

let config = PITRConfig {
    wal_directory: PathBuf::from("/var/lib/heliosdb/wal"),
    snapshot_directory: PathBuf::from("/var/lib/heliosdb/snapshots"),
    archive_directory: PathBuf::from("/var/lib/heliosdb/archive"),
    rpo: RPO::OneMinute,
    rto: RTO::ThirtyMinutes,
    wal_segment_size_mb: 16,
    max_wal_segments: 100,
    snapshot_interval_secs: 3600,
    enable_compression: true,
    enable_checksums: true,
    recovery_workers: 4,
    archive_compression_level: 6,
};

Two enums to internalise:

pub enum RPO { OneMinute, FiveMinutes, FifteenMinutes, OneHour } // max acceptable data loss
pub enum RTO { FiveMinutes, FifteenMinutes, ThirtyMinutes, OneHour } // max acceptable recovery time

These defaults give you RPO = 1 min, RTO = 30 min, hourly snapshots, 16 MB WAL segments, and gzip-6 archives. Fine for most production workloads. Tighten as needed; nothing else changes.
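If PITRConfig implements Default (an assumption; check the crate before relying on it), tightening for a stricter SLA is a three-field change:

use heliosdb_pitr::{PITRConfig, RPO, RTO};

// Hypothetical: assumes PITRConfig derives Default with the values above.
let config = PITRConfig {
    rpo: RPO::OneMinute,
    rto: RTO::FiveMinutes,
    snapshot_interval_secs: 600, // snapshot every 10 minutes so less WAL needs replaying
    ..PITRConfig::default()
};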


2. Bring Up the Coordinator

use heliosdb_pitr::PITRCoordinator;
use std::sync::Arc;

let coordinator = PITRCoordinator::new(Arc::new(config));
coordinator.initialize().await?;

initialize() does three things:

  1. Opens the WAL manager and rolls a new segment if needed.
  2. Loads previously created recovery points from the archive.
  3. Wires the recovery engine to the WAL manager.

After this call, the system is archiving WAL continuously (subject to wal_segment_size_mb rollover) and creating snapshots every snapshot_interval_secs.
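A minimal bring-up sketch, assuming a Tokio runtime (the crate's async executor isn't stated here) and the section 1 config:

use heliosdb_pitr::{PITRConfig, PITRCoordinator, RPO, RTO};
use std::path::PathBuf;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = PITRConfig {
        wal_directory: PathBuf::from("/var/lib/heliosdb/wal"),
        snapshot_directory: PathBuf::from("/var/lib/heliosdb/snapshots"),
        archive_directory: PathBuf::from("/var/lib/heliosdb/archive"),
        rpo: RPO::OneMinute,
        rto: RTO::ThirtyMinutes,
        wal_segment_size_mb: 16,
        max_wal_segments: 100,
        snapshot_interval_secs: 3600,
        enable_compression: true,
        enable_checksums: true,
        recovery_workers: 4,
        archive_compression_level: 6,
    };

    let coordinator = PITRCoordinator::new(Arc::new(config));
    coordinator.initialize().await?; // WAL archival + snapshots now run in the background
    Ok(())
}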


3. The Five Recovery Targets

Per recovery.rs:

pub enum RecoveryTarget {
    LSN(u64),                 // recover to a specific WAL log sequence number
    Timestamp(DateTime<Utc>), // recover to a specific wall-clock time
    Transaction(u64),         // recover to just before / just after a txid
    Latest,                   // most recent recoverable point
    RecoveryPoint(String),    // a previously named point
}

Timestamp is the one you'll use most. LSN and Transaction are forensic tools; Latest and RecoveryPoint cover routine operations (crash recovery and pre-deploy rollbacks, respectively).
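All five variants, constructed with illustrative values:

use heliosdb_pitr::recovery::RecoveryTarget;
use chrono::{TimeZone, Utc};

let by_lsn = RecoveryTarget::LSN(0x1A2B_3C4D); // exact WAL position
let by_time = RecoveryTarget::Timestamp(
    Utc.with_ymd_and_hms(2025, 3, 14, 14, 32, 6).unwrap(), // "14:32:06"
);
let by_txid = RecoveryTarget::Transaction(987_654); // transaction boundary
let latest = RecoveryTarget::Latest; // everything recoverable
let named = RecoveryTarget::RecoveryPoint("pre-v8.0.3-release".to_string());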


4. Recover to a Timestamp

use heliosdb_pitr::recovery::{RecoveryRequest, RecoveryTarget, RecoveryMode};
use chrono::{Utc, Duration};
use std::path::PathBuf;

let target_time = Utc::now() - Duration::minutes(13); // "14:32:06"

let request = RecoveryRequest {
    target: RecoveryTarget::Timestamp(target_time),
    mode: RecoveryMode::Full,
    tables: None, // all tables
    target_directory: PathBuf::from("/var/lib/heliosdb-restored"),
    verify_checksums: true,
    workers: 4,
};

let stats = coordinator.recover(request).await?;
println!("Bytes processed: {}", stats.bytes_processed);
println!("Records replayed: {}", stats.records_processed);
println!("Recovery duration: {:?}", stats.duration);

The recovery engine:

  1. Finds the most recent snapshot at or before the target time.
  2. Restores the snapshot to target_directory.
  3. Replays WAL records from the snapshot LSN up to the target timestamp.
  4. Verifies checksums on every WAL record (since verify_checksums: true).
  5. Returns once the target has been reached. Anything earlier than the base snapshot is unreachable; anything later would require WAL that hasn't been archived yet.

It does not overwrite your live database. Recovery goes to a fresh target_directory. You promote it manually when you’re ready.
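Step 1 is the load-bearing one. A sketch of the selection logic, with a hypothetical Snapshot shape (the real type lives in the snapshot module):

use chrono::{DateTime, Utc};

// Hypothetical shape for illustration; see snapshot.rs for the real struct.
struct Snapshot {
    created_at: DateTime<Utc>,
    lsn: u64, // WAL replay starts here
}

// Most recent snapshot at or before the target; replay covers the gap to the target.
fn base_snapshot(snapshots: &[Snapshot], target: DateTime<Utc>) -> Option<&Snapshot> {
    snapshots
        .iter()
        .filter(|s| s.created_at <= target)
        .max_by_key(|s| s.created_at)
}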


5. Recover Just One Table

let request = RecoveryRequest {
    target: RecoveryTarget::Timestamp(target_time),
    mode: RecoveryMode::Partial,
    tables: Some(vec!["orders".to_string(), "order_items".to_string()]),
    target_directory: PathBuf::from("/var/lib/heliosdb-restored-orders"),
    verify_checksums: true,
    workers: 4,
};

coordinator.recover(request).await?;

RecoveryMode::Partial tells the engine to skip WAL records that don’t touch the listed tables. Useful when only one table got nuked and you don’t want to replay everyone else’s last 12 hours of writes.
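The filter itself is the simple part. A sketch with a hypothetical WalRecord shape (the real record type lives in the wal module):

// Hypothetical record shape for illustration.
struct WalRecord {
    table: String,
    lsn: u64,
}

// Partial mode: records for unlisted tables are skipped, not replayed.
fn should_replay(record: &WalRecord, tables: Option<&[String]>) -> bool {
    match tables {
        None => true, // Full mode replays everything
        Some(list) => list.iter().any(|t| t == &record.table),
    }
}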


6. Validate Without Recovering

let request = RecoveryRequest {
    target: RecoveryTarget::Timestamp(target_time),
    mode: RecoveryMode::ValidationOnly,
    tables: None,
    target_directory: PathBuf::from("/tmp/throwaway"),
    verify_checksums: true,
    workers: 4,
};

coordinator.recover(request).await?;

ValidationOnly walks the WAL chain, verifies checksums, and confirms the target is reachable — without writing anything. Run this in your DR drill cron job to make sure your archives are actually usable. If checksums fail, you find out before the disaster.
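A drill sketch you could schedule nightly; it assumes recover() returns a Result whose error type implements std::error::Error:

use heliosdb_pitr::PITRCoordinator;
use heliosdb_pitr::recovery::{RecoveryMode, RecoveryRequest, RecoveryTarget};
use chrono::{Duration, Utc};
use std::path::PathBuf;

// Prove that "24 hours ago" is still reachable and that checksums hold.
async fn dr_drill(coordinator: &PITRCoordinator) -> Result<(), Box<dyn std::error::Error>> {
    let request = RecoveryRequest {
        target: RecoveryTarget::Timestamp(Utc::now() - Duration::hours(24)),
        mode: RecoveryMode::ValidationOnly,
        tables: None,
        target_directory: PathBuf::from("/tmp/throwaway"), // nothing gets written
        verify_checksums: true,
        workers: 4,
    };
    coordinator.recover(request).await?; // fails loudly if archives are unusable
    Ok(())
}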


7. Storage Backends

The crate's storage.rs abstracts the archive destination. PITR supports local disk, S3, Azure Blob, and GCS; it is not tied to AWS.

Set archive_directory to a path that your storage layer maps to your bucket:

Destination   archive_directory example
Local disk    /var/lib/heliosdb/archive
S3            s3://my-bucket/heliosdb/pitr
Azure Blob    az://account/container/heliosdb
GCS           gs://my-bucket/heliosdb/pitr

Compression (enable_compression: true) and archive compression level (0–9) are independent of the destination.
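For example, redirecting the archive to S3 changes one field. The bucket name is illustrative, scheme parsing is storage.rs's job, and the Default assumption from section 1 applies:

use heliosdb_pitr::PITRConfig;
use std::path::PathBuf;

// Hypothetical: assumes PITRConfig implements Default.
let config = PITRConfig {
    archive_directory: PathBuf::from("s3://my-bucket/heliosdb/pitr"),
    enable_compression: true,      // independent of destination
    archive_compression_level: 9,  // 0-9: higher = smaller archives, more CPU
    ..PITRConfig::default()
};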


8. RPO/RTO Tuning

Goal                 RPO        RTO            snapshot_interval_secs  recovery_workers
Compliance baseline  OneHour    OneHour        3600                    4
Standard prod        OneMinute  ThirtyMinutes  3600                    4
Tight SLA            OneMinute  FiveMinutes    600                     8

Tightening RPO costs WAL archive bandwidth. Tightening RTO costs more snapshots (so there’s less WAL to replay) and more recovery workers.

The PITR engine returns PITRError::RTOExceeded { expected, actual } if a recovery overruns its target, which is useful for SLA monitoring.
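A sketch of surfacing that error to monitoring, assuming PITRError is exported at the crate root and the variant's fields implement Debug:

use heliosdb_pitr::{PITRCoordinator, PITRError};
use heliosdb_pitr::recovery::RecoveryRequest;

async fn recover_with_sla(
    coordinator: &PITRCoordinator,
    request: RecoveryRequest,
) -> Result<(), PITRError> {
    match coordinator.recover(request).await {
        Ok(_stats) => Ok(()),
        Err(PITRError::RTOExceeded { expected, actual }) => {
            // Budget blown: emit a metric or page someone before propagating.
            eprintln!("RTO breach: budget {:?}, actual {:?}", expected, actual);
            Err(PITRError::RTOExceeded { expected, actual })
        }
        Err(e) => Err(e),
    }
}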


9. Named Recovery Points

Before a risky deploy, mark a known-good point:

let point = coordinator.create_recovery_point("pre-v8.0.3-release").await?;

// ... deploy ...

// if it goes wrong:
let request = RecoveryRequest {
    target: RecoveryTarget::RecoveryPoint("pre-v8.0.3-release".to_string()),
    /* ... */
};

Named points survive WAL truncation as long as their underlying LSN is still inside max_wal_segments.
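Filled out, the rollback request mirrors section 4 (the target directory here is illustrative):

use heliosdb_pitr::recovery::{RecoveryMode, RecoveryRequest, RecoveryTarget};
use std::path::PathBuf;

let request = RecoveryRequest {
    target: RecoveryTarget::RecoveryPoint("pre-v8.0.3-release".to_string()),
    mode: RecoveryMode::Full,
    tables: None,
    target_directory: PathBuf::from("/var/lib/heliosdb-rollback"),
    verify_checksums: true,
    workers: 4,
};
let stats = coordinator.recover(request).await?;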


10. Plug into the Coordinator from SQL

Recovery is a binary-level operation; the SQL surface is for inspection:

-- See available recovery points
SELECT id, timestamp, wal_lsn, snapshot_id, size_bytes
FROM pg_recovery_points
ORDER BY timestamp DESC LIMIT 20;

-- See current WAL position
SELECT pg_current_wal_lsn(), pg_last_wal_replay_lsn();

The actual recover() call is run from the coordinator binary or via the Rust API — never from the live SQL session you’re trying to roll back.


Where Next