Zero-Downtime Schema Migration Architecture

Document Version: 1.0
Created: November 9, 2025
Status: Phase 2 Architecture Design
Author: System Architecture Designer


Executive Summary

This document defines the complete architecture for HeliosDB’s zero-downtime schema migration system, enabling online DDL operations during live traffic without table locks or application downtime. The system is inspired by proven tools like GitHub’s gh-ost and Percona’s pt-online-schema-change, but extends them with distributed capabilities, AI-driven optimization, and native integration with HeliosDB’s multi-model architecture.

Key Capabilities

  • Online DDL: ADD/DROP/MODIFY COLUMN, ADD/DROP INDEX without blocking operations
  • Ghost Table Approach: Shadow table creation with trigger-based replication
  • Atomic Cutover: Millisecond switchover with automatic rollback
  • Zero Data Loss: Transaction-consistent migration with validation
  • Performance Monitoring: Real-time overhead tracking and throttling
  • Multi-Model Support: Works across document, graph, time-series, and relational models

1. System Architecture

1.1 High-Level Architecture

┌─────────────────────────────────────────────────────────────────────┐
│ Application Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ SQL Client │ │ GraphQL API │ │ REST API │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ └─────────────────┴─────────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Migration Coordinator (Router) │ │
│ │ • Route to original or ghost table │ │
│ │ • Handle dual-write during migration │ │
│ └───────────────┬─────────────────────────────┘ │
│ │ │
└─────────────────────────┼───────────────────────────────────────────┘
┌─────────────────────────┼───────────────────────────────────────────┐
│ Zero-Downtime Migration Engine │
│ │ │
│ ┌──────────────────────▼──────────────────────────┐ │
│ │ Migration Orchestrator │ │
│ │ • State machine management │ │
│ │ • Phase transitions │ │
│ │ • Rollback coordination │ │
│ └──┬────────────────┬────────────────┬────────────┘ │
│ │ │ │ │
│ ┌──▼────────┐ ┌───▼──────┐ ┌─────▼──────┐ ┌─────────────┐ │
│ │ Ghost │ │ Copy │ │ Trigger │ │ Validator │ │
│ │ Table │ │ Engine │ │ Manager │ │ Engine │ │
│ │ Manager │ │ │ │ │ │ │ │
│ └───────────┘ └──────────┘ └────────────┘ └─────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Migration State Store │ │
│ │ • Migration metadata (state, progress, metrics) │ │
│ │ • Cutover checkpoints │ │
│ │ • Rollback snapshots │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────┼───────────────────────────────────────────┐
│ Storage Layer │
│ │ │
│ ┌──────────────────────▼──────────────────────────┐ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ Original │ │ Ghost │ │ │
│ │ │ Table │ │ Table │ │ │
│ │ │ │ │ (_new_xxx) │ │ │
│ │ └──────────────┘ └──────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────────┐ │ │
│ │ │ Trigger Layer (Dual Write) │ │ │
│ │ │ • INSERT/UPDATE/DELETE capture │ │ │
│ │ │ • Async replication to ghost │ │ │
│ │ └──────────────────────────────────────┘ │ │
│ │ │ │
│ │ HeliosDB Storage Engine │ │
│ │ (LSM Tree, WAL, MVCC, Transaction Manager) │ │
│ └──────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘

1.2 Component Breakdown

1.2.1 Migration Coordinator

  • Purpose: Routes queries to appropriate table during migration
  • Responsibilities:
    • Query routing (original vs ghost table)
    • Dual-write coordination
    • Read/write splitting during cutover
    • Performance monitoring and throttling

1.2.2 Migration Orchestrator

  • Purpose: Manages migration lifecycle and state transitions
  • Responsibilities:
    • State machine execution
    • Phase orchestration (Planning → Executing → Validating → Cutover)
    • Error handling and rollback coordination
    • Progress tracking and reporting

1.2.3 Ghost Table Manager

  • Purpose: Creates and manages shadow tables
  • Responsibilities:
    • Ghost table creation with new schema
    • Naming convention (_helios_ghost_{table_name}_{timestamp})
    • Cleanup after successful cutover
    • Preservation for rollback

1.2.4 Copy Engine

  • Purpose: Efficiently copy data from original to ghost table
  • Responsibilities:
    • Chunked, resumable bulk copy
    • Throttling to avoid overwhelming storage
    • Progress tracking (rows copied, estimated time remaining)
    • Delta detection for incremental catch-up
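
The chunked, resumable copy can be sketched as keyset pagination over the primary key. This is a minimal illustration, assuming an integer primary key; the `next_chunk_range` and `chunk_copy_sql` helpers are hypothetical names, and a real Copy Engine would checkpoint `last_copied_pk` in the Migration State Store and use bound parameters rather than string formatting.

```rust
/// Compute the next chunk's inclusive key range for a keyset-paginated
/// bulk copy. Hypothetical sketch: `last_copied_pk` would be persisted as
/// a checkpoint so a crashed copy can resume where it left off.
fn next_chunk_range(last_copied_pk: u64, chunk_size: u64, max_pk: u64) -> Option<(u64, u64)> {
    if last_copied_pk >= max_pk {
        return None; // bulk copy complete
    }
    let start = last_copied_pk + 1;
    let end = (start + chunk_size - 1).min(max_pk);
    Some((start, end))
}

/// Render one chunk as a copy statement (illustrative only).
fn chunk_copy_sql(table: &str, ghost: &str, range: (u64, u64)) -> String {
    format!(
        "INSERT INTO {ghost} SELECT * FROM {table} WHERE id BETWEEN {} AND {}",
        range.0, range.1
    )
}
```

Keyset ranges (rather than `OFFSET`) keep each chunk an index-range scan, so chunk cost stays constant as the copy progresses.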

1.2.5 Trigger Manager

  • Purpose: Synchronize ongoing changes during migration
  • Responsibilities:
    • Install INSERT/UPDATE/DELETE triggers on original table
    • Capture changes to changelog table
    • Async replication to ghost table
    • Conflict resolution (last-write-wins)

1.2.6 Validator Engine

  • Purpose: Ensure data consistency between original and ghost
  • Responsibilities:
    • Row count validation
    • Checksum comparison (chunk-wise)
    • Sample-based deep comparison
    • Index validation

1.2.7 Migration State Store

  • Purpose: Persistent storage for migration metadata
  • Responsibilities:
    • Migration state persistence
    • Progress checkpoints
    • Rollback snapshots
    • Audit trail

2. Migration Lifecycle State Machine

2.1 State Diagram

┌──────────────┐
│ CREATED │
└──────┬───────┘
│ start_migration()
┌──────────────┐
│ PLANNING │──────────► FAILED (validation error)
└──────┬───────┘
│ validate()
┌──────────────┐
│ PREPARING │
│ Ghost Table │
└──────┬───────┘
│ create_ghost_table()
┌──────────────┐
┌───────────►│ COPYING │──────────► PAUSED
│ │ Bulk Data │◄──────────┘ (resume_migration)
│ └──────┬───────┘
│ │ bulk_copy_complete()
│ ▼
│ ┌──────────────┐
│ │SYNCHRONIZING │
│ │ Incremental │
│ └──────┬───────┘
│ │ catch_up_complete()
│ ▼
│ ┌──────────────┐
│ │ VALIDATING │──────────► FAILED (validation failed)
│ └──────┬───────┘ │
│ │ validation_passed() │
│ ▼ │
│ ┌──────────────┐ │
│ │ READY FOR │ │
│ │ CUTOVER │ │
│ └──────┬───────┘ │
│ │ cutover() │
│ ▼ │
│ ┌──────────────┐ │
│ ┌───┤ CUTOVER │ │
│ │ │ In Progress │ │
│ │ └──────┬───────┘ │
│ │ │ cutover_complete() │
│ │ ▼ │
│ │ ┌──────────────┐ │
│ │ │ COMPLETED │ │
│ │ └──────────────┘ │
│ │ │
│ │ │
│ rollback() rollback()
│ │ │
│ ▼ ▼
│ ┌──────────────┐ ┌──────────────┐
└────┤ ROLLING │◄─────────────┤ FAILED │
│ BACK │ └──────────────┘
└──────┬───────┘
│ rollback_complete()
┌──────────────┐
│ ROLLED BACK │
└──────────────┘

2.2 State Transitions

| Current State | Event | Next State | Reversible |
|---|---|---|---|
| CREATED | start_migration() | PLANNING | Yes |
| PLANNING | validate() | PREPARING | Yes |
| PREPARING | create_ghost_table() | COPYING | Yes |
| COPYING | bulk_copy_complete() | SYNCHRONIZING | ❌ No* |
| COPYING | pause_migration() | PAUSED | Yes |
| PAUSED | resume_migration() | COPYING | Yes |
| SYNCHRONIZING | catch_up_complete() | VALIDATING | ❌ No* |
| VALIDATING | validation_passed() | READY FOR CUTOVER | Yes |
| READY FOR CUTOVER | cutover() | CUTOVER | Yes (for 5 min) |
| CUTOVER | cutover_complete() | COMPLETED | ❌ No |
| ANY | rollback() | ROLLING BACK | - |
| ROLLING BACK | rollback_complete() | ROLLED BACK | ❌ No |

*Can rollback but will lose incremental copy progress
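
The transition table maps naturally onto a state machine. The sketch below encodes it in Rust; the enum variants follow the table, but the event-string API and function names are illustrative assumptions, not HeliosDB's actual types.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MigrationState {
    Created, Planning, Preparing, Copying, Paused,
    Synchronizing, Validating, ReadyForCutover,
    Cutover, Completed, RollingBack, RolledBack, Failed,
}

/// Legal transitions from the table above. rollback() is handled last
/// because it is valid from almost any in-flight state.
fn next_state(current: MigrationState, event: &str) -> Option<MigrationState> {
    use MigrationState::*;
    match (current, event) {
        (Created, "start_migration") => Some(Planning),
        (Planning, "validate") => Some(Preparing),
        (Preparing, "create_ghost_table") => Some(Copying),
        (Copying, "bulk_copy_complete") => Some(Synchronizing),
        (Copying, "pause_migration") => Some(Paused),
        (Paused, "resume_migration") => Some(Copying),
        (Synchronizing, "catch_up_complete") => Some(Validating),
        (Validating, "validation_passed") => Some(ReadyForCutover),
        (ReadyForCutover, "cutover") => Some(Cutover),
        (Cutover, "cutover_complete") => Some(Completed),
        (RollingBack, "rollback_complete") => Some(RolledBack),
        // Terminal states cannot start a rollback
        (Completed | RolledBack, "rollback") => None,
        (_, "rollback") => Some(RollingBack),
        _ => None, // illegal transition: reject rather than corrupt state
    }
}
```

Centralizing transitions in one total function lets the orchestrator reject illegal events instead of silently corrupting migration state.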

2.3 Rollback Points

The system supports rollback at different stages with varying complexity:

  1. CREATED → PLANNING → PREPARING: Instant rollback

    • No data copied, no triggers installed
    • Simply drop ghost table (if created)
  2. COPYING → SYNCHRONIZING: Fast rollback

    • Drop triggers
    • Drop ghost table
    • Update migration state
    • Estimated time: <10 seconds
  3. VALIDATING → READY FOR CUTOVER: Fast rollback

    • Same as above
    • Validation data already discarded
  4. CUTOVER (first 5 minutes): Emergency rollback

    • Requires table rename swap back
    • Reapply missed writes from changelog
    • Estimated time: <30 seconds
    • Hard limit: 5 minutes, after that rollback is NOT safe
  5. COMPLETED: No rollback

    • Original table dropped
    • Must create new migration to revert schema

3. Ghost Table Approach

3.1 Ghost Table Lifecycle

The ghost table approach is inspired by GitHub’s gh-ost but adapted for HeliosDB’s distributed architecture:

Step 1: Create Ghost Table
┌─────────────────────────────────────┐
│ Original Table: users │
│ ┌─────────┬──────────┬──────────┐ │
│ │ id (PK) │ name │ email │ │
│ ├─────────┼──────────┼──────────┤ │
│ │ 1 │ Alice │ a@ex.com │ │
│ │ 2 │ Bob │ b@ex.com │ │
│ └─────────┴──────────┴──────────┘ │
└─────────────────────────────────────┘
│ CREATE GHOST TABLE
┌─────────────────────────────────────────────┐
│ Ghost Table: _helios_ghost_users_20251109 │
│ ┌─────────┬──────────┬──────────┬────────┐ │
│ │ id (PK) │ name │ email │ phone │ │ ← New column
│ ├─────────┼──────────┼──────────┼────────┤ │
│ │ (empty) │ │ │ │ │
│ └─────────┴──────────┴──────────┴────────┘ │
└─────────────────────────────────────────────┘
Step 2: Install Triggers (Dual Write)
┌─────────────────────────────────────┐
│ Original Table: users │
│ ┌─────────────────────────────┐ │
│ │ Triggers Installed: │ │
│ │ • INSERT → Changelog │ │
│ │ • UPDATE → Changelog │ │
│ │ • DELETE → Changelog │ │
│ └─────────────────────────────┘ │
│ ┌─────────┬──────────┬──────────┐ │
│ │ id (PK) │ name │ email │ │
│ ├─────────┼──────────┼──────────┤ │
│ │ 1 │ Alice │ a@ex.com │ │
│ │ 2 │ Bob │ b@ex.com │ │
│ └─────────┴──────────┴──────────┘ │
└─────────────┬───────────────────────┘
│ Async Replication
┌──────────────────────────────────────┐
│ Changelog Table │
│ ┌────────┬─────┬─────────────────┐ │
│ │ lsn │ op │ data │ │
│ ├────────┼─────┼─────────────────┤ │
│ │ 100001 │ INS │ {id:3,name:...} │ │
│ │ 100002 │ UPD │ {id:1,email:..} │ │
│ └────────┴─────┴─────────────────┘ │
└──────────────────────────────────────┘
Step 3: Bulk Copy (Chunked)
┌─────────────────────────────────────┐
│ Original Table: users (100M rows) │
│ ┌─────────┬──────────┬──────────┐ │
│ │ id (PK) │ name │ email │ │
│ └─────────┴──────────┴──────────┘ │
└─────────────┬───────────────────────┘
│ Chunked Copy (10K rows/chunk)
│ Throttled (max 50% I/O)
┌─────────────────────────────────────────────┐
│ Ghost Table: _helios_ghost_users_20251109 │
│ ┌─────────┬──────────┬──────────┬────────┐ │
│ │ id (PK) │ name │ email │ phone │ │
│ ├─────────┼──────────┼──────────┼────────┤ │
│ │ 1 │ Alice │ a@ex.com │ NULL │ │
│ │ 2 │ Bob │ b@ex.com │ NULL │ │
│ │ ... │ ... │ ... │ NULL │ │
│ │ (99.9% complete) │ │
│ └─────────┴──────────┴──────────┴────────┘ │
└─────────────────────────────────────────────┘
Step 4: Catch-up (Apply Changelog)
┌──────────────────────────────────────┐
│ Changelog Table │
│ Apply to ghost table in LSN order │
│ ┌────────┬─────┬─────────────────┐ │
│ │ lsn │ op │ data │ │
│ ├────────┼─────┼─────────────────┤ │
│ │ 100001 │ INS │ {id:3,name:...} │ │ ──┐
│ │ 100002 │ UPD │ {id:1,email:..} │ │ │
│ └────────┴─────┴─────────────────┘ │ │
└──────────────────────────────────────┘ │
│ Apply
┌─────────────────────────────────────────────┐
│ Ghost Table: _helios_ghost_users_20251109 │
│ ┌─────────┬──────────┬──────────┬────────┐ │
│ │ id (PK) │ name │ email │ phone │ │
│ ├─────────┼──────────┼──────────┼────────┤ │
│ │ 1 │ Alice │ new@... │ NULL │ │ ← Updated
│ │ 2 │ Bob │ b@ex.com │ NULL │ │
│ │ 3 │ Charlie │ c@ex.com │ NULL │ │ ← Inserted
│ │ (100% in sync) │ │
│ └─────────┴──────────┴──────────┴────────┘ │
└─────────────────────────────────────────────┘
Step 5: Atomic Cutover (Rename Swap)
BEGIN TRANSACTION;
-- Lock both tables briefly (milliseconds)
LOCK TABLE users, _helios_ghost_users_20251109;
-- Apply final changelog entries
-- (should be <100 rows accumulated during lock)
-- Atomic rename
ALTER TABLE users RENAME TO _helios_old_users_20251109;
ALTER TABLE _helios_ghost_users_20251109 RENAME TO users;
-- Update metadata catalog
UPDATE helios_migration_state SET status = 'COMPLETED';
COMMIT;
Post-Cutover:
┌─────────────────────────────────────────────┐
│ New Table: users (with new schema) │
│ ┌─────────┬──────────┬──────────┬────────┐ │
│ │ id (PK) │ name │ email │ phone │ │
│ ├─────────┼──────────┼──────────┼────────┤ │
│ │ 1 │ Alice │ new@... │ NULL │ │
│ │ 2 │ Bob │ b@ex.com │ NULL │ │
│ │ 3 │ Charlie │ c@ex.com │ NULL │ │
│ └─────────┴──────────┴──────────┴────────┘ │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│ Backup: _helios_old_users_20251109 │
│ (kept for 24 hours for emergency rollback) │
└─────────────────────────────────────────────┘

3.2 Trigger-Based Synchronization

HeliosDB uses a changelog table approach to capture changes:

-- Changelog table structure
CREATE TABLE _helios_changelog_{table_name}_{timestamp} (
    lsn       BIGINT PRIMARY KEY,   -- Log Sequence Number
    operation VARCHAR(10) NOT NULL, -- INSERT, UPDATE, DELETE
    pk_value  TEXT NOT NULL,        -- Primary key of affected row
    data      JSONB,                -- Full row data (for INSERT/UPDATE)
    timestamp TIMESTAMP NOT NULL,   -- When change occurred
    applied   BOOLEAN DEFAULT false -- Has been applied to ghost
);

-- Trigger on original table (pseudo-code)
CREATE TRIGGER capture_changes
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH ROW
BEGIN
    INSERT INTO _helios_changelog_users_20251109 (
        lsn, operation, pk_value, data, timestamp
    )
    VALUES (
        next_lsn(),
        operation_type,
        NEW.id, -- OLD.id for DELETE
        row_to_json(NEW),
        NOW()
    );
END;

Replication Worker (async process):

  • Polls changelog table for applied = false
  • Applies changes to ghost table in LSN order
  • Marks as applied = true
  • Handles conflicts (last-write-wins based on timestamp)
  • Throttles to avoid overwhelming storage
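
The replication worker's inner loop can be sketched against an in-memory stand-in for the ghost table. `ChangelogEntry`, `Op`, and the `BTreeMap` stand-in are illustrative assumptions; the LSN-ordered, upsert-based apply logic follows the description above.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
enum Op { Insert, Update, Delete }

#[derive(Debug, Clone)]
struct ChangelogEntry {
    lsn: u64,
    op: Op,
    pk: u64,
    data: Option<String>, // row payload as JSON text (None for DELETE)
}

/// Apply pending changelog entries to the ghost table in LSN order.
/// A sketch of one batch of the worker loop; the real worker polls for
/// `applied = false`, applies, then persists `applied = true`.
fn apply_changelog(
    ghost: &mut BTreeMap<u64, String>,
    entries: &mut Vec<ChangelogEntry>,
) -> usize {
    entries.sort_by_key(|e| e.lsn); // LSN order guarantees convergence
    let mut applied = 0;
    for e in entries.iter() {
        match e.op {
            // Upsert: the row may already exist from the bulk copy, in
            // which case the newer changelog entry wins (last-write-wins).
            Op::Insert | Op::Update => {
                if let Some(data) = &e.data {
                    ghost.insert(e.pk, data.clone());
                }
            }
            Op::Delete => {
                ghost.remove(&e.pk);
            }
        }
        applied += 1;
    }
    applied
}
```

Treating INSERT and UPDATE identically as upserts is what makes replay idempotent when a change races with the bulk copy of the same row.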

4. Migration Operations Support

4.1 Supported DDL Operations

| Operation | Complexity | Downtime | Data Copy | Notes |
|---|---|---|---|---|
| ADD COLUMN (nullable) | Low | 0ms | No | Metadata-only change |
| ADD COLUMN (NOT NULL) | Medium | ~ms | Yes | Requires default or backfill |
| DROP COLUMN | Low | ~ms | No | Metadata-only, data ignored |
| MODIFY COLUMN (type) | High | ~ms | Yes | Full table rewrite if incompatible |
| RENAME COLUMN | Low | 0ms | No | Metadata-only change |
| ADD INDEX | Medium | 0ms | No | Built concurrently on ghost |
| DROP INDEX | Low | 0ms | No | Metadata-only change |
| ADD PRIMARY KEY | High | ~ms | Yes | Full table rewrite |
| DROP PRIMARY KEY | High | ~ms | Yes | Full table rewrite |
| ADD FOREIGN KEY | Medium | ~ms | No | Validation required |
| DROP FOREIGN KEY | Low | 0ms | No | Metadata-only change |
| ADD UNIQUE CONSTRAINT | Medium | ~ms | No | Validation + index build |
| DROP UNIQUE CONSTRAINT | Low | 0ms | No | Metadata-only change |

4.2 Operation Implementation Details

4.2.1 ADD COLUMN (nullable)

Scenario: Add optional phone column to users table.

Strategy: Fast Path - Metadata Only

-- Original command
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- HeliosDB internal execution
BEGIN TRANSACTION;

-- Update schema metadata
UPDATE helios_catalog.columns
SET schema_version = schema_version + 1
WHERE table_name = 'users';

INSERT INTO helios_catalog.columns (
    table_name, column_name, data_type, nullable, default_value
) VALUES (
    'users', 'phone', 'VARCHAR(20)', true, NULL
);

-- No data migration needed
-- Existing rows will return NULL for phone column
COMMIT;

Overhead: <10ms (metadata write only)

4.2.2 ADD COLUMN (NOT NULL with default)

Scenario: Add mandatory created_at column with default value.

Strategy: Ghost Table Migration

-- Original command
ALTER TABLE users ADD COLUMN created_at TIMESTAMP NOT NULL DEFAULT NOW();

-- HeliosDB internal execution
-- Step 1: Create ghost table with new column
CREATE TABLE _helios_ghost_users_20251109 AS
SELECT *, NOW() AS created_at FROM users WHERE 1=0;

-- Step 2: Install triggers for dual write

-- Step 3: Bulk copy with default value (one chunk; repeated with
-- keyset pagination so each pass resumes where the last one stopped)
INSERT INTO _helios_ghost_users_20251109
SELECT *, NOW() AS created_at
FROM users
WHERE id > :last_copied_id
ORDER BY id
LIMIT 10000;

-- Step 4: Catch-up from changelog
-- Step 5: Atomic cutover

Duration: proportional to table size; see Section 8.2 for end-to-end estimates (~50 minutes for 100M rows of 1KB at a 50% I/O throttle)

4.2.3 MODIFY COLUMN (type change)

Scenario: Change user_id from INT to BIGINT.

Strategy: Ghost Table with Type Casting

-- Original command
ALTER TABLE users MODIFY COLUMN user_id BIGINT;

-- HeliosDB internal execution
-- Create ghost table with new type
CREATE TABLE _helios_ghost_users_20251109 (
    user_id BIGINT, -- Changed from INT
    name VARCHAR(255),
    email VARCHAR(255)
);

-- Copy with casting (chunked, keyset-paginated)
INSERT INTO _helios_ghost_users_20251109
SELECT CAST(user_id AS BIGINT), name, email
FROM users
WHERE user_id > :last_copied_id
ORDER BY user_id
LIMIT 10000;

Type Compatibility Matrix:

| From → To | Safe? | Strategy |
|---|---|---|
| INT → BIGINT | Yes | Direct cast |
| BIGINT → INT | ⚠ Warning | Cast with overflow check |
| VARCHAR(N) → VARCHAR(M) where M>N | Yes | Direct |
| VARCHAR(N) → VARCHAR(M) where M<N | ❌ No | Error if data > M |
| TIMESTAMP → DATE | ⚠ Warning | Truncation (time lost) |
| JSON → TEXT | Yes | Stringify |
| TEXT → JSON | ⚠ Warning | Parse (may fail) |
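
The checks behind the ⚠ and ❌ rows reduce to per-value guards applied during the copy. A minimal sketch, with illustrative function names:

```rust
/// Overflow-checked narrowing cast for BIGINT → INT, per the matrix:
/// allowed, but every value must be range-checked during the copy.
fn narrow_bigint_to_int(v: i64) -> Result<i32, String> {
    i32::try_from(v).map_err(|_| format!("value {v} overflows INT during MODIFY COLUMN"))
}

/// VARCHAR(N) → VARCHAR(M): widening is always safe; narrowing is only
/// safe when no existing value exceeds the new limit M.
fn varchar_resize_safe(existing_max_len: usize, new_limit: usize) -> bool {
    existing_max_len <= new_limit
}
```

Running these guards chunk-by-chunk lets the migration fail fast on the first incompatible value instead of discovering the problem at cutover.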

4.2.4 ADD INDEX

Scenario: Add index on email column.

Strategy: Concurrent Index Build

-- Original command
CREATE INDEX idx_users_email ON users(email);
-- HeliosDB internal execution
-- Build index on ghost table while it's being populated
-- No need for triggers since we're adding index, not changing data
CREATE INDEX CONCURRENTLY idx_users_email
ON _helios_ghost_users_20251109(email);

Overhead: Index build time (~1-10 min for 100M rows)

4.2.5 DROP COLUMN

Scenario: Remove deprecated legacy_field column.

Strategy: Ghost Table without Column

-- Original command
ALTER TABLE users DROP COLUMN legacy_field;

-- HeliosDB internal execution
CREATE TABLE _helios_ghost_users_20251109 (
    user_id BIGINT,
    name VARCHAR(255),
    email VARCHAR(255)
    -- legacy_field omitted
);

-- Copy only required columns (chunked, keyset-paginated)
INSERT INTO _helios_ghost_users_20251109
SELECT user_id, name, email -- Exclude legacy_field
FROM users
WHERE user_id > :last_copied_id
ORDER BY user_id
LIMIT 10000;

Overhead: Minimal (column ignored during copy)


5. Rollback Capabilities

5.1 One-Click Rollback Design

Philosophy: Rollback should be as easy as starting the migration.

Implementation:

// Simple rollback API
migration_engine.rollback(migration_id)?;
// Or via SQL
ROLLBACK MIGRATION 'migration-uuid-1234';

5.2 Rollback Mechanisms by Phase

Phase 1-3 (PLANNING → COPYING): Drop Ghost Table

async fn rollback_early_phase(migration_id: &str) -> Result<()> {
    // 1. Drop triggers if installed
    drop_triggers(migration_id).await?;

    // 2. Drop ghost table
    let ghost_table = format!("_helios_ghost_{}", migration_id);
    execute_sql(&format!("DROP TABLE IF EXISTS {}", ghost_table)).await?;

    // 3. Drop changelog table
    let changelog_table = format!("_helios_changelog_{}", migration_id);
    execute_sql(&format!("DROP TABLE IF EXISTS {}", changelog_table)).await?;

    // 4. Update migration state
    update_migration_state(migration_id, MigrationState::RolledBack).await?;

    Ok(())
}

Time: <10 seconds. Risk: None (no production data affected).

Phase 4 (VALIDATING → READY): Same as Phase 1-3

Phase 5 (CUTOVER - First 5 Minutes): Emergency Rename Swap

async fn rollback_cutover(migration_id: &str) -> Result<()> {
    // Check if within 5-minute window
    let cutover_time = get_cutover_timestamp(migration_id).await?;
    let elapsed = Utc::now() - cutover_time;
    if elapsed > Duration::minutes(5) {
        return Err(Error::RollbackWindowExpired);
    }

    // Begin transaction
    let txn = begin_transaction().await?;

    // 1. Lock both tables
    txn.execute("LOCK TABLE users, _helios_old_users_20251109").await?;

    // 2. Swap names back
    txn.execute("ALTER TABLE users RENAME TO _helios_temp_broken").await?;
    txn.execute("ALTER TABLE _helios_old_users_20251109 RENAME TO users").await?;
    txn.execute("DROP TABLE _helios_temp_broken").await?;

    // 3. Apply missed writes from changelog
    let changelog = get_cutover_changelog(migration_id).await?;
    for entry in changelog {
        apply_changelog_entry(&txn, &entry).await?;
    }

    // 4. Update metadata
    txn.execute(&format!(
        "UPDATE helios_migration_state SET status = 'ROLLED_BACK' WHERE id = '{}'",
        migration_id
    )).await?;

    txn.commit().await?;
    Ok(())
}

Time: 10-30 seconds. Risk: Low (within 5-minute window). Data Loss: None (changelog replay ensures consistency).

Phase 6 (COMPLETED): No Rollback

Once migration is marked COMPLETED and old table is dropped, rollback is NOT possible. To revert schema changes:

  1. Create a new migration with inverse operations
  2. Execute new migration to restore old schema

Recommendation: Keep _helios_old_{table} for 24 hours before dropping.

5.3 Rollback Safety Checks

pub struct RollbackSafetyChecker {
    migration_state: MigrationState,
    cutover_timestamp: Option<DateTime<Utc>>,
    old_table_exists: bool,
    changelog_exists: bool,
}

impl RollbackSafetyChecker {
    pub fn can_rollback(&self) -> Result<RollbackFeasibility> {
        match self.migration_state {
            MigrationState::Planning |
            MigrationState::Preparing |
            MigrationState::Copying => {
                Ok(RollbackFeasibility::Safe {
                    estimated_time_seconds: 10,
                    data_loss_risk: DataLossRisk::None,
                })
            }
            MigrationState::Synchronizing |
            MigrationState::Validating |
            MigrationState::ReadyForCutover => {
                Ok(RollbackFeasibility::Safe {
                    estimated_time_seconds: 15,
                    data_loss_risk: DataLossRisk::Low,
                })
            }
            MigrationState::Cutover => {
                if let Some(cutover_time) = self.cutover_timestamp {
                    let elapsed = Utc::now() - cutover_time;
                    if elapsed <= Duration::minutes(5) {
                        Ok(RollbackFeasibility::Risky {
                            estimated_time_seconds: 30,
                            data_loss_risk: DataLossRisk::Low,
                            warnings: vec![
                                "Within emergency rollback window".to_string(),
                                "Some queries may fail during rollback".to_string(),
                            ],
                        })
                    } else {
                        Err(Error::RollbackWindowExpired)
                    }
                } else {
                    Err(Error::InvalidState)
                }
            }
            MigrationState::Completed => {
                if self.old_table_exists {
                    Ok(RollbackFeasibility::Possible {
                        estimated_time_seconds: 300,
                        data_loss_risk: DataLossRisk::High,
                        warnings: vec![
                            "Migration completed, old table still exists".to_string(),
                            "Rollback will require new migration".to_string(),
                        ],
                    })
                } else {
                    Err(Error::CannotRollback {
                        reason: "Old table dropped, rollback impossible".to_string(),
                    })
                }
            }
            _ => Err(Error::InvalidState),
        }
    }
}

6. Migration Testing Framework

6.1 Shadow Traffic Validation

Before cutover, HeliosDB can validate the ghost table using shadow traffic:

┌─────────────────────────────────────────────────────────────┐
│ Application Traffic │
└───────────────────────┬─────────────────────────────────────┘
┌──────────────────────────────┐
│ Migration Coordinator │
│ (Shadow Mode Enabled) │
└───────┬──────────────┬───────┘
│ │
Primary │ │ Shadow (read-only)
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Original Table │ │ Ghost Table │
│ (users) │ │ (_new_users) │
└─────────────────┘ └─────────────────┘
│ │
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Return Result │ │ Compare Result │
│ to Application │ │ Log Diff │
└─────────────────┘ └─────────────────┘

Shadow Traffic Mode:

  • All read queries executed against BOTH tables
  • Results compared for consistency
  • Differences logged for analysis
  • Only original table results returned to application
  • Zero impact on application (async comparison)

Validation Metrics:

  • Consistency Rate: % of queries with matching results
  • Performance Delta: Ghost table vs original (latency, throughput)
  • Data Completeness: Row count, checksum comparison
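
The two headline metrics reduce to simple ratios; the function names below are illustrative, not HeliosDB's API:

```rust
/// Percentage of shadow queries whose ghost-table result matched the
/// original table's result.
fn consistency_rate(matching: u64, total: u64) -> f64 {
    if total == 0 {
        return 100.0; // no shadow traffic yet: vacuously consistent
    }
    matching as f64 / total as f64 * 100.0
}

/// Relative latency delta of the ghost table vs the original, in percent.
/// Positive means the ghost table is slower.
fn performance_delta_percent(original_ms: f64, ghost_ms: f64) -> f64 {
    (ghost_ms - original_ms) / original_ms * 100.0
}
```

A cutover gate might then require, say, a consistency rate of 100% and a performance delta under 20% (the same threshold the test suite in 6.2 uses).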

6.2 Automated Testing Before Cutover

pub struct MigrationTestSuite {
    migration_id: String,
    original_table: String,
    ghost_table: String,
}

impl MigrationTestSuite {
    /// Run comprehensive pre-cutover tests
    pub async fn run_all_tests(&self) -> Result<TestReport> {
        let mut report = TestReport::new();

        // Test 1: Row count validation
        report.add_result(self.test_row_count().await?);
        // Test 2: Checksum comparison
        report.add_result(self.test_checksums().await?);
        // Test 3: Sample-based deep comparison
        report.add_result(self.test_sample_comparison(1000).await?);
        // Test 4: Index validation
        report.add_result(self.test_index_consistency().await?);
        // Test 5: Constraint validation
        report.add_result(self.test_constraints().await?);
        // Test 6: Performance comparison
        report.add_result(self.test_performance().await?);
        // Test 7: Shadow traffic validation (if enabled)
        if self.is_shadow_mode_enabled() {
            report.add_result(self.test_shadow_traffic().await?);
        }

        Ok(report)
    }

    async fn test_row_count(&self) -> Result<TestResult> {
        let original_count = query_one::<i64>(&format!(
            "SELECT COUNT(*) FROM {}", self.original_table
        )).await?;
        let ghost_count = query_one::<i64>(&format!(
            "SELECT COUNT(*) FROM {}", self.ghost_table
        )).await?;

        Ok(TestResult {
            name: "Row Count Validation".to_string(),
            passed: original_count == ghost_count,
            message: format!(
                "Original: {}, Ghost: {}",
                original_count, ghost_count
            ),
        })
    }

    async fn test_checksums(&self) -> Result<TestResult> {
        // Compute checksums for 100 chunks
        let chunk_size = self.get_table_size().await? / 100;
        let mut mismatches = Vec::new();

        for i in 0..100 {
            let offset = i * chunk_size;
            let original_checksum = self.compute_checksum(
                &self.original_table, offset, chunk_size
            ).await?;
            let ghost_checksum = self.compute_checksum(
                &self.ghost_table, offset, chunk_size
            ).await?;
            if original_checksum != ghost_checksum {
                mismatches.push(i);
            }
        }

        Ok(TestResult {
            name: "Checksum Comparison".to_string(),
            passed: mismatches.is_empty(),
            message: if mismatches.is_empty() {
                "All chunks match".to_string()
            } else {
                format!("Mismatched chunks: {:?}", mismatches)
            },
        })
    }

    async fn test_sample_comparison(&self, sample_size: usize) -> Result<TestResult> {
        // Randomly sample rows and compare
        let sample_ids = self.get_random_sample_ids(sample_size).await?;
        let mut mismatches = 0;

        for id in sample_ids {
            let original_row = self.get_row(&self.original_table, id).await?;
            let ghost_row = self.get_row(&self.ghost_table, id).await?;
            if original_row != ghost_row {
                mismatches += 1;
            }
        }

        Ok(TestResult {
            name: "Sample Comparison".to_string(),
            passed: mismatches == 0,
            message: format!(
                "Sampled {} rows, {} mismatches",
                sample_size, mismatches
            ),
        })
    }

    async fn test_performance(&self) -> Result<TestResult> {
        // Run representative queries on both tables
        let queries = vec![
            "SELECT * FROM {} WHERE id = 12345",
            "SELECT COUNT(*) FROM {} WHERE created_at > NOW() - INTERVAL '1 day'",
            "SELECT AVG(value) FROM {} GROUP BY category",
        ];

        let mut original_times = Vec::new();
        let mut ghost_times = Vec::new();

        for query_template in queries {
            let original_query = query_template.replace("{}", &self.original_table);
            let ghost_query = query_template.replace("{}", &self.ghost_table);
            let original_time = measure_query_time(&original_query).await?;
            let ghost_time = measure_query_time(&ghost_query).await?;
            original_times.push(original_time);
            ghost_times.push(ghost_time);
        }

        let avg_original = original_times.iter().sum::<f64>() / original_times.len() as f64;
        let avg_ghost = ghost_times.iter().sum::<f64>() / ghost_times.len() as f64;
        let performance_delta = ((avg_ghost - avg_original) / avg_original) * 100.0;

        // Pass if ghost table is within 20% of original performance
        let passed = performance_delta.abs() < 20.0;

        Ok(TestResult {
            name: "Performance Comparison".to_string(),
            passed,
            message: format!(
                "Original: {:.2}ms, Ghost: {:.2}ms, Delta: {:.1}%",
                avg_original, avg_ghost, performance_delta
            ),
        })
    }
}

6.3 Migration Dry Run

// Execute migration in dry-run mode
let migration = MigrationBuilder::new()
    .table("users")
    .add_column("phone", DataType::VarChar(20), /* nullable */ true)
    .dry_run(true) // No actual changes
    .build();

let result = migration_engine.execute(migration).await?;

// Dry run provides:
// - Estimated duration
// - Resource requirements (disk space, I/O)
// - Potential issues
// - Rollback feasibility

7. API Design

7.1 SQL Syntax

7.1.1 Start Online Migration

-- Explicit online migration
ALTER TABLE users
ADD COLUMN phone VARCHAR(20) NULL
WITH (MIGRATION_MODE = 'ONLINE');
-- Shorthand (online by default)
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Force offline migration (faster for large tables with downtime tolerance)
ALTER TABLE users
ADD COLUMN phone VARCHAR(20)
WITH (MIGRATION_MODE = 'OFFLINE');

7.1.2 Migration Options

ALTER TABLE users MODIFY COLUMN user_id BIGINT
WITH (
MIGRATION_MODE = 'ONLINE',
CHUNK_SIZE = 10000, -- Rows per chunk
THROTTLE_IO = 50, -- Max % of I/O capacity
ENABLE_SHADOW_TRAFFIC = true, -- Validate with shadow traffic
ROLLBACK_RETENTION = '24h' -- Keep old table for 24 hours
);

7.1.3 Monitor Migration

-- Show all active migrations
SHOW MIGRATIONS;
-- Show specific migration details
SHOW MIGRATION 'migration-uuid-1234';
-- Output:
-- Migration ID: migration-uuid-1234
-- Table: users
-- Operation: ADD COLUMN phone VARCHAR(20)
-- Status: COPYING
-- Progress: 67% (67M/100M rows)
-- Elapsed Time: 00:15:32
-- Estimated Remaining: 00:07:45
-- Current Phase: Bulk Copy
-- Rollback Available: Yes

7.1.4 Control Migration

-- Pause migration (useful for maintenance windows)
PAUSE MIGRATION 'migration-uuid-1234';
-- Resume paused migration
RESUME MIGRATION 'migration-uuid-1234';
-- Rollback migration
ROLLBACK MIGRATION 'migration-uuid-1234';
-- Force cutover (skip validation)
CUTOVER MIGRATION 'migration-uuid-1234' WITH (FORCE = true);

7.2 Programmatic API (Rust)

use heliosdb::migration::{MigrationEngine, MigrationBuilder, MigrationMode};

// Create migration engine
let migration_engine = MigrationEngine::new(config);

// Build migration
let migration = MigrationBuilder::new()
    .table("users")
    .add_column(ColumnDefinition {
        name: "phone".to_string(),
        data_type: DataType::VarChar(20),
        nullable: true,
        default: None,
    })
    .mode(MigrationMode::Online)
    .chunk_size(10_000)
    .throttle_io_percent(50)
    .enable_shadow_traffic(true)
    .build()?;

// Execute migration asynchronously
let migration_id = migration_engine.start(migration).await?;

// Monitor progress
let progress = migration_engine.get_progress(&migration_id).await?;
println!("Progress: {}%", progress.percent_complete);

// Pause if needed
migration_engine.pause(&migration_id).await?;

// Resume
migration_engine.resume(&migration_id).await?;

// Rollback
migration_engine.rollback(&migration_id).await?;

7.3 REST API

# Start migration
curl -X POST http://localhost:8080/api/migrations \
  -H "Content-Type: application/json" \
  -d '{
    "table": "users",
    "operation": {
      "type": "ADD_COLUMN",
      "column": {
        "name": "phone",
        "type": "VARCHAR(20)",
        "nullable": true
      }
    },
    "mode": "ONLINE",
    "options": {
      "chunk_size": 10000,
      "throttle_io_percent": 50,
      "enable_shadow_traffic": true
    }
  }'
# Response:
# {
# "migration_id": "migration-uuid-1234",
# "status": "PLANNING",
# "estimated_duration_seconds": 450
# }
# Get migration status
curl http://localhost:8080/api/migrations/migration-uuid-1234
# Pause migration
curl -X POST http://localhost:8080/api/migrations/migration-uuid-1234/pause
# Resume migration
curl -X POST http://localhost:8080/api/migrations/migration-uuid-1234/resume
# Rollback migration
curl -X POST http://localhost:8080/api/migrations/migration-uuid-1234/rollback

8. Performance Analysis

8.1 Overhead During Migration

| Phase | I/O Overhead | CPU Overhead | Latency Impact | Notes |
|---|---|---|---|---|
| Planning | 0% | <1% | 0ms | Metadata-only |
| Ghost Table Creation | ~5% | ~2% | 0ms | One-time |
| Trigger Installation | 0% | 0% | +0.1ms/write | Small overhead |
| Bulk Copy | 10-50%* | 5-15% | 0ms | Configurable throttle |
| Synchronization | 5-10% | 2-5% | +0.2ms/write | Changelog replication |
| Validation | 20-40% | 10-20% | 0ms | Background process |
| Cutover | 0% | <1% | 5-10ms | Brief lock |

*Throttle configurable via THROTTLE_IO parameter

8.2 Migration Duration Estimates

Assumptions:

  • Table size: 100M rows
  • Row size: 1KB average
  • Total data: 100GB
  • I/O capacity: 100MB/s
  • Throttle: 50% (50MB/s allocated to migration)

| Phase | Duration | Calculation |
|---|---|---|
| Planning | <1 second | Metadata operations |
| Ghost Table Creation | ~1 second | DDL operation |
| Trigger Installation | <1 second | DDL operation |
| Bulk Copy | ~33 minutes | 100GB ÷ 50MB/s = 2000s ≈ 33 min |
| Synchronization | ~5 minutes | Catch up on incremental changes |
| Validation | ~10 minutes | Checksum + sampling |
| Cutover | <10 seconds | Rename + apply final changelog |
| Total | ~50 minutes | End-to-end |

Scaling:

  • 10M rows: ~5 minutes
  • 1B rows: ~8 hours
  • 10B rows: ~3 days (consider partitioned migration)
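
These estimates all come from one formula: bytes to copy divided by the I/O bandwidth granted to the migration. A minimal sketch using decimal units, matching the 100GB ÷ 50MB/s = 2000s calculation above (the function name is illustrative):

```rust
/// Back-of-envelope bulk-copy duration estimate:
/// duration_secs = total_bytes / (io_capacity * throttle_fraction)
fn estimate_copy_seconds(total_bytes: u64, io_bytes_per_sec: u64, throttle_fraction: f64) -> f64 {
    total_bytes as f64 / (io_bytes_per_sec as f64 * throttle_fraction)
}
```

For the 100GB example at 100MB/s capacity and a 50% throttle, this yields 2000 seconds, or about 33 minutes; the rest of the ~50-minute total is synchronization, validation, and cutover.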

8.3 Optimization Techniques

8.3.1 Adaptive Throttling

pub struct AdaptiveThrottle {
    target_io_percent: f64,
    current_io_usage: f64,
    adjustment_factor: f64,
}

impl AdaptiveThrottle {
    pub fn adjust_chunk_size(&mut self, current_chunk_size: usize) -> usize {
        // Measure current I/O usage
        self.current_io_usage = self.measure_io_usage();

        // Adjust chunk size based on I/O headroom
        let io_headroom = self.target_io_percent - self.current_io_usage;
        if io_headroom > 10.0 {
            // Plenty of headroom, increase chunk size
            (current_chunk_size as f64 * 1.2) as usize
        } else if io_headroom < -5.0 {
            // Over budget, decrease chunk size
            (current_chunk_size as f64 * 0.8) as usize
        } else {
            // Within acceptable range
            current_chunk_size
        }
    }
}

8.3.2 Parallel Chunk Processing

For very large tables, process multiple chunks in parallel:

pub async fn parallel_copy(
    &self,
    table: &str,
    ghost_table: &str,
    total_rows: u64,
    chunk_size: usize,
    parallelism: usize,
) -> Result<()> {
    let num_chunks = (total_rows as usize + chunk_size - 1) / chunk_size;
    // Create channel for chunk IDs
    let (tx, rx) = mpsc::channel(num_chunks);
    // Send all chunk IDs
    for chunk_id in 0..num_chunks {
        tx.send(chunk_id).await?;
    }
    drop(tx);
    // tokio's mpsc Receiver is single-consumer and not Clone, so share it
    // across workers behind an async mutex (Arc<tokio::sync::Mutex<_>>)
    let rx = Arc::new(Mutex::new(rx));
    // Spawn worker tasks
    let mut handles = Vec::new();
    for _worker_id in 0..parallelism {
        let rx = Arc::clone(&rx);
        let table = table.to_string();
        let ghost_table = ghost_table.to_string();
        let handle = tokio::spawn(async move {
            loop {
                // Take the next chunk ID; drop the lock guard before copying
                let next = {
                    let mut guard = rx.lock().await;
                    guard.recv().await
                };
                let chunk_id = match next {
                    Some(id) => id,
                    None => break, // channel drained, no more chunks
                };
                let offset = chunk_id * chunk_size;
                copy_chunk(&table, &ghost_table, offset, chunk_size).await?;
            }
            Ok::<(), Error>(())
        });
        handles.push(handle);
    }
    // Wait for all workers
    for handle in handles {
        handle.await??;
    }
    Ok(())
}

Performance: typically 2-4x faster for large tables; gains taper off once the copy becomes I/O bound

8.3.3 Compression During Copy

// Compress data during copy to reduce I/O
INSERT INTO _helios_ghost_users_20251109
SELECT
id,
name,
COMPRESS(large_text_field) as large_text_field
FROM users
WHERE id BETWEEN ? AND ?;

Benefit: 30-50% faster for tables with compressible data


9. Failure Modes & Recovery

9.1 Failure Scenarios

| Failure | Phase | Impact | Recovery |
|---------|-------|--------|----------|
| Process Crash | Any | Migration paused | Resume from last checkpoint |
| Disk Full | Copying | Migration paused | Free disk space, resume |
| Network Partition | Copying | Partial data loss | Resync from changelog |
| Validation Failure | Validating | Migration failed | Rollback, investigate |
| Cutover Timeout | Cutover | Partial cutover | Emergency rollback |
| Trigger Error | Synchronizing | Data inconsistency | Pause, resync, resume |
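The failure-to-recovery mapping above lends itself to a mechanical lookup so the orchestrator can choose an action without ad-hoc logic. A minimal sketch (the FailureKind and RecoveryAction names are illustrative, not from the implementation):

```rust
#[derive(Debug, PartialEq)]
enum FailureKind {
    ProcessCrash,
    DiskFull,
    NetworkPartition,
    ValidationFailure,
    CutoverTimeout,
    TriggerError,
}

#[derive(Debug, PartialEq)]
enum RecoveryAction {
    ResumeFromCheckpoint,
    PauseUntilDiskFreed,
    ResyncFromChangelog,
    RollbackAndInvestigate,
    EmergencyRollback,
    PauseResyncResume,
}

/// Deterministic recovery policy mirroring the failure table.
fn recovery_for(failure: &FailureKind) -> RecoveryAction {
    use FailureKind::*;
    use RecoveryAction::*;
    match failure {
        ProcessCrash => ResumeFromCheckpoint,
        DiskFull => PauseUntilDiskFreed,
        NetworkPartition => ResyncFromChangelog,
        ValidationFailure => RollbackAndInvestigate,
        CutoverTimeout => EmergencyRollback,
        TriggerError => PauseResyncResume,
    }
}

fn main() {
    println!("{:?}", recovery_for(&FailureKind::DiskFull));
}
```

Because the match is exhaustive, adding a new failure kind forces the policy to be extended at compile time.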

9.2 Recovery Strategies

9.2.1 Checkpoint-Based Resume

pub struct MigrationCheckpoint {
    migration_id: String,
    last_copied_pk: i64,   // Last primary key copied
    last_applied_lsn: i64, // Last LSN applied from changelog
    rows_copied: u64,
    timestamp: DateTime<Utc>,
}

impl MigrationEngine {
    pub async fn resume_from_checkpoint(&self, migration_id: &str) -> Result<()> {
        // Load checkpoint
        let checkpoint = self.load_checkpoint(migration_id).await?;
        // Resume from last position
        match self.get_migration_state(migration_id).await? {
            MigrationState::Copying => {
                // Resume bulk copy from last_copied_pk
                self.resume_bulk_copy(migration_id, checkpoint.last_copied_pk).await?;
            }
            MigrationState::Synchronizing => {
                // Resume changelog application from last_applied_lsn
                self.resume_changelog_sync(migration_id, checkpoint.last_applied_lsn).await?;
            }
            _ => {
                return Err(Error::InvalidState);
            }
        }
        Ok(())
    }
}

9.2.2 Automatic Retry with Exponential Backoff

pub async fn copy_chunk_with_retry(
    &self,
    table: &str,
    ghost_table: &str,
    offset: u64,
    chunk_size: usize,
) -> Result<()> {
    let mut retries = 0;
    let max_retries = 5;
    loop {
        match self.copy_chunk(table, ghost_table, offset, chunk_size).await {
            Ok(_) => return Ok(()),
            Err(e) if retries < max_retries => {
                let backoff = Duration::from_secs(2_u64.pow(retries));
                warn!("Chunk copy failed, retrying in {:?}: {}", backoff, e);
                tokio::time::sleep(backoff).await;
                retries += 1;
            }
            Err(e) => {
                error!("Chunk copy failed after {} retries: {}", retries, e);
                return Err(e);
            }
        }
    }
}

9.2.3 Health Checks

pub struct MigrationHealthChecker {
    migration_id: String,
}

impl MigrationHealthChecker {
    pub async fn run_health_check(&self) -> HealthStatus {
        let mut issues = Vec::new();
        // Check disk space
        if self.available_disk_space().await < self.estimated_space_needed().await {
            issues.push(HealthIssue::InsufficientDisk);
        }
        // Check trigger status
        if !self.triggers_active().await {
            issues.push(HealthIssue::TriggersInactive);
        }
        // Check changelog lag
        let lag = self.changelog_lag().await;
        if lag > Duration::from_secs(300) {
            issues.push(HealthIssue::ChangelogLagging);
        }
        // Check data consistency
        if let Some(drift) = self.detect_data_drift().await {
            issues.push(HealthIssue::DataDrift(drift));
        }
        if issues.is_empty() {
            HealthStatus::Healthy
        } else {
            HealthStatus::Unhealthy(issues)
        }
    }
}

10. Integration with HeliosDB

10.1 Schema Management Integration

heliosdb-metadata/src/schema_service.rs
impl SchemaService {
    /// Apply schema change with zero-downtime migration
    pub async fn apply_schema_change(
        &self,
        table: &str,
        change: SchemaChange,
        options: MigrationOptions,
    ) -> Result<MigrationId> {
        // Check if online migration is needed
        let requires_migration = self.requires_migration(&change)?;
        if requires_migration {
            // Use zero-downtime migration
            let migration_engine = self.get_migration_engine();
            let migration = MigrationBuilder::new()
                .table(table)
                .change(change)
                .options(options)
                .build()?;
            migration_engine.start(migration).await
        } else {
            // Fast path: metadata-only change
            self.apply_metadata_change(table, change).await
        }
    }
}

10.2 Transaction Manager Integration

heliosdb-mvcc/src/transaction_manager.rs
impl TransactionManager {
    /// Route query to appropriate table during migration
    pub async fn execute_query(
        &self,
        query: &Query,
    ) -> Result<QueryResult> {
        let table = query.table();
        // Check if migration in progress
        if let Some(migration) = self.get_active_migration(table).await? {
            match migration.state {
                MigrationState::Copying |
                MigrationState::Synchronizing |
                MigrationState::Validating => {
                    // Route to original table
                    self.execute_on_table(table, query).await
                }
                MigrationState::Cutover => {
                    // Brief lock during cutover:
                    // wait for cutover to complete before executing
                    self.wait_for_cutover(&migration).await?;
                    self.execute_on_table(table, query).await
                }
                _ => {
                    self.execute_on_table(table, query).await
                }
            }
        } else {
            // No migration, execute normally
            self.execute_on_table(table, query).await
        }
    }
}

10.3 Storage Engine Integration

heliosdb-storage/src/lsm_storage.rs
impl LsmStorage {
    /// Support for ghost table operations
    pub async fn create_ghost_table(
        &self,
        original_table: &str,
        new_schema: &Schema,
    ) -> Result<String> {
        let ghost_table_name = format!(
            "_helios_ghost_{}_{}",
            original_table,
            Utc::now().timestamp()
        );
        // Create new LSM tree with new schema
        self.create_table(&ghost_table_name, new_schema).await?;
        Ok(ghost_table_name)
    }

    /// Atomic table rename for cutover
    pub async fn atomic_rename(
        &self,
        old_name: &str,
        new_name: &str,
    ) -> Result<()> {
        // Use storage-level atomic rename;
        // this is a metadata operation only
        self.metadata_store.rename_table(old_name, new_name).await
    }
}

10.4 Multi-Model Support

The migration system works across all HeliosDB data models:

10.4.1 Document Model

heliosdb-document/src/schema.rs
// Migrate document collection schema
ALTER COLLECTION users ADD FIELD phone STRING NULL;
// Internally uses ghost collection approach
// - Create ghost collection with new schema
// - Copy documents with new field (set to null)
// - Sync incremental changes
// - Atomic cutover

10.4.2 Graph Model

heliosdb-graph/src/schema.rs
// Migrate vertex schema
ALTER VERTEX User ADD PROPERTY phone STRING NULL;
// Ghost vertex table approach
// - Create ghost vertex table
// - Copy vertices with new property
// - Preserve edges (reference by ID)
// - Atomic cutover

10.4.3 Time-Series Model

heliosdb-timeseries/src/schema.rs
// Migrate time-series metric schema
ALTER METRIC cpu_usage ADD TAG hostname STRING;
// Special handling for time-series:
// - Only apply to new data (forward-only)
// - Historical data remains unchanged
// - No ghost table needed

11. Edge Cases & Solutions

11.1 Large Tables (>1TB)

Challenge: Single migration takes too long (days)

Solution: Partitioned Migration

// Migrate table partition by partition
for partition in table.partitions() {
    let migration = MigrationBuilder::new()
        .table(table.name())
        .partition(partition.id())
        .change(change.clone())
        .build()?;
    migration_engine.start(migration).await?;
}
// Cutover all partitions atomically
migration_engine.cutover_all(table.name()).await?;

11.2 Complex Constraints

Challenge: Foreign key constraints between tables

Solution: Constraint-Aware Migration

// Detect foreign key constraints
let constraints = self.get_foreign_keys(table)?;
for constraint in &constraints {
    // Temporarily disable foreign key checks during migration
    self.disable_constraint(constraint)?;
}
// Perform migration
migration_engine.start(migration).await?;
// Re-enable and validate constraints (iterate by reference so the
// collection is not consumed by the first loop)
for constraint in &constraints {
    self.enable_constraint(constraint)?;
    self.validate_constraint(constraint)?;
}

11.3 Hot Partitions

Challenge: Some partitions have very high write rate

Solution: Intelligent Scheduling

pub struct PartitionMigrationScheduler {
    partitions: Vec<Partition>,
}

impl PartitionMigrationScheduler {
    pub fn schedule(&self) -> Vec<MigrationTask> {
        // Measure write rate for each partition
        let write_rates = self.measure_write_rates();
        // Schedule migrations during low-traffic periods
        let mut tasks = Vec::new();
        for partition in &self.partitions {
            let low_traffic_window = self.find_low_traffic_window(partition);
            tasks.push(MigrationTask {
                partition: partition.clone(),
                scheduled_start: low_traffic_window.start,
                estimated_duration: self.estimate_duration(partition),
            });
        }
        tasks
    }
}

11.4 Schema Evolution Conflicts

Challenge: Multiple migrations on same table

Solution: Migration Queue

pub struct MigrationQueue {
    pending: VecDeque<Migration>,
    active: Option<Migration>,
}

impl MigrationQueue {
    pub async fn enqueue(&mut self, migration: Migration) -> Result<()> {
        // Check for conflicts
        if let Some(active) = &self.active {
            if active.table == migration.table {
                // Queue migration for later
                self.pending.push_back(migration);
                return Ok(());
            }
        }
        // No conflict, start immediately
        self.start(migration).await
    }

    pub async fn on_migration_complete(&mut self, migration_id: &str) -> Result<()> {
        self.active = None;
        // Start next queued migration
        if let Some(next) = self.pending.pop_front() {
            self.start(next).await?;
        }
        Ok(())
    }
}

12. Testing Strategy

12.1 Unit Tests

#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_ghost_table_creation() {
        let engine = MigrationEngine::new(config);
        let ghost_table = engine.create_ghost_table("users", new_schema).await.unwrap();
        assert!(ghost_table.starts_with("_helios_ghost_users_"));
    }

    #[tokio::test]
    async fn test_chunked_copy() {
        let engine = MigrationEngine::new(config);
        engine.copy_chunk("users", ghost_table, 0, 10_000).await.unwrap();
        let count = query_one::<i64>(&format!("SELECT COUNT(*) FROM {}", ghost_table)).await.unwrap();
        assert_eq!(count, 10_000);
    }

    #[tokio::test]
    async fn test_rollback_early_phase() {
        let engine = MigrationEngine::new(config);
        let migration_id = engine.start(migration).await.unwrap();
        engine.rollback(&migration_id).await.unwrap();
        // Verify ghost table is dropped
        assert!(!table_exists(&ghost_table_name));
    }
}

12.2 Integration Tests

#[tokio::test]
async fn test_end_to_end_migration() {
    // Setup: Create table with 1M rows
    let db = TestDatabase::new().await;
    db.execute("CREATE TABLE users (id BIGINT PRIMARY KEY, name VARCHAR(255))").await.unwrap();
    db.insert_rows("users", 1_000_000).await.unwrap();
    // Start migration: Add column
    let migration_id = db.migrate(
        "ALTER TABLE users ADD COLUMN email VARCHAR(255) NULL"
    ).await.unwrap();
    // Wait for completion
    db.wait_for_migration(&migration_id).await.unwrap();
    // Verify new schema
    let schema = db.get_schema("users").await.unwrap();
    assert!(schema.has_column("email"));
    // Verify data integrity
    let count = db.query_one::<i64>("SELECT COUNT(*) FROM users").await.unwrap();
    assert_eq!(count, 1_000_000);
}
#[tokio::test]
async fn test_migration_with_live_traffic() {
    let db = TestDatabase::new().await;
    db.execute("CREATE TABLE users (id BIGINT PRIMARY KEY, name VARCHAR(255))").await.unwrap();
    db.insert_rows("users", 100_000).await.unwrap();
    // Start migration
    let migration_id = db.migrate(
        "ALTER TABLE users ADD COLUMN email VARCHAR(255) NULL"
    ).await.unwrap();
    // Simulate live traffic during migration; clone the handle so the
    // original `db` is not moved into the spawned task
    let writer_db = db.clone();
    let write_handle = tokio::spawn(async move {
        for i in 100_000..110_000 {
            writer_db.execute(&format!(
                "INSERT INTO users (id, name) VALUES ({}, 'user_{}')", i, i
            )).await.unwrap();
            tokio::time::sleep(Duration::from_millis(10)).await;
        }
    });
    // Wait for both migration and writes
    db.wait_for_migration(&migration_id).await.unwrap();
    write_handle.await.unwrap();
    // Verify all writes are present
    let count = db.query_one::<i64>("SELECT COUNT(*) FROM users").await.unwrap();
    assert_eq!(count, 110_000);
}

12.3 Chaos Engineering Tests

#[tokio::test]
async fn test_migration_resilience_to_failures() {
    let db = TestDatabase::new().await;
    db.insert_rows("users", 1_000_000).await.unwrap();
    // Start migration
    let migration_id = db.migrate(
        "ALTER TABLE users ADD COLUMN email VARCHAR(255) NULL"
    ).await.unwrap();
    // Wait until 50% complete
    db.wait_for_progress(&migration_id, 50).await.unwrap();
    // Simulate process crash
    db.crash_and_restart().await;
    // Resume migration
    db.resume_migration(&migration_id).await.unwrap();
    // Verify completion
    db.wait_for_migration(&migration_id).await.unwrap();
    let count = db.query_one::<i64>("SELECT COUNT(*) FROM users").await.unwrap();
    assert_eq!(count, 1_000_000);
}

#[tokio::test]
async fn test_rollback_during_cutover() {
    let db = TestDatabase::new().await;
    db.insert_rows("users", 100_000).await.unwrap();
    let migration_id = db.migrate(
        "ALTER TABLE users MODIFY COLUMN id BIGINT"
    ).await.unwrap();
    // Wait for cutover
    db.wait_for_state(&migration_id, MigrationState::Cutover).await.unwrap();
    // Trigger emergency rollback
    db.rollback_migration(&migration_id).await.unwrap();
    // Verify original schema restored
    let schema = db.get_schema("users").await.unwrap();
    assert_eq!(schema.get_column("id").data_type, DataType::Int);
}

13. Monitoring & Observability

13.1 Metrics

pub struct MigrationMetrics {
    pub migration_id: String,
    // Progress metrics
    pub rows_copied: Counter,
    pub rows_remaining: Gauge,
    pub percent_complete: Gauge,
    // Performance metrics
    pub copy_throughput_rows_per_sec: Gauge,
    pub copy_throughput_bytes_per_sec: Gauge,
    pub io_wait_time_ms: Histogram,
    // Resource metrics
    pub disk_space_used_bytes: Gauge,
    pub cpu_usage_percent: Gauge,
    pub memory_usage_bytes: Gauge,
    // Reliability metrics
    pub retries: Counter,
    pub errors: Counter,
    pub checkpoint_saves: Counter,
    // Validation metrics
    pub validation_mismatches: Counter,
    pub checksum_failures: Counter,
}

impl MigrationMetrics {
    pub fn export_prometheus(&self) -> String {
        // Export metrics in Prometheus exposition format
        format!(
            "# HELP heliosdb_migration_progress Migration progress percentage\n\
             # TYPE heliosdb_migration_progress gauge\n\
             heliosdb_migration_progress{{migration_id=\"{}\"}} {}\n\
             \n\
             # HELP heliosdb_migration_throughput Copy throughput in rows/sec\n\
             # TYPE heliosdb_migration_throughput gauge\n\
             heliosdb_migration_throughput{{migration_id=\"{}\"}} {}\n",
            self.migration_id,
            self.percent_complete.get(),
            self.migration_id,
            self.copy_throughput_rows_per_sec.get()
        )
    }
}

13.2 Logging

// Structured logging for migrations
info!(
    migration_id = %migration_id,
    table = %table,
    operation = %operation,
    state = ?state,
    progress_percent = progress,
    "Migration progress update"
);

warn!(
    migration_id = %migration_id,
    chunk_id = chunk_id,
    error = %err,
    "Chunk copy failed, retrying"
);

error!(
    migration_id = %migration_id,
    reason = %reason,
    "Migration failed, initiating rollback"
);

13.3 Dashboards

Recommended Grafana Dashboard Panels:

  1. Migration Progress

    • % Complete gauge
    • Estimated time remaining
    • Rows copied / total rows
  2. Throughput

    • Rows/sec line chart
    • MB/sec line chart
  3. Resource Usage

    • CPU % line chart
    • Memory usage line chart
    • Disk I/O line chart
  4. Health

    • Error rate counter
    • Retry rate counter
    • Validation failures counter

14. Future Enhancements

14.1 AI-Driven Migration Optimization

Vision: Use ML to predict optimal migration parameters

pub struct AIOptimizer {
    model: MigrationOptimizationModel,
}

impl AIOptimizer {
    pub fn predict_optimal_params(
        &self,
        table_stats: &TableStats,
        workload_pattern: &WorkloadPattern,
    ) -> MigrationParams {
        // ML model predicts:
        // - Optimal chunk size
        // - Best time to start migration
        // - Expected duration
        // - Risk of performance impact
        let features = self.extract_features(table_stats, workload_pattern);
        let prediction = self.model.predict(features);
        MigrationParams {
            chunk_size: prediction.chunk_size,
            throttle_io_percent: prediction.throttle,
            recommended_start_time: prediction.best_start_time,
            estimated_duration: prediction.duration,
        }
    }
}

14.2 Cross-Region Migration

Vision: Migrate schemas across geographically distributed clusters

pub struct CrossRegionMigration {
    source_region: Region,
    target_regions: Vec<Region>,
}

impl CrossRegionMigration {
    pub async fn execute(&self, migration: Migration) -> Result<()> {
        // 1. Migrate in source region first
        let source_migration_id = self.migrate_region(
            &self.source_region,
            migration.clone()
        ).await?;
        // 2. Replicate to target regions
        for region in &self.target_regions {
            self.replicate_migration(region, &source_migration_id).await?;
        }
        // 3. Coordinated cutover across all regions
        self.global_cutover(migration.table()).await?;
        Ok(())
    }
}

14.3 Blue-Green Schema Deployment

Vision: Deploy schema changes with instant rollback capability

// Deploy new schema as "green" version
let green_schema = db.deploy_schema_version("users", "v2", new_schema).await?;
// Route 10% of traffic to green
db.route_traffic("users", /* blue */ 90, /* green */ 10).await?;
// Monitor for issues
if green_schema.has_errors() {
    // Instant rollback
    db.route_traffic("users", /* blue */ 100, /* green */ 0).await?;
} else {
    // Gradually shift traffic
    db.route_traffic("users", /* blue */ 0, /* green */ 100).await?;
}

15. Conclusion

The Zero-Downtime Migration Architecture enables HeliosDB to perform schema changes on live production systems without downtime, data loss, or performance degradation. Key advantages:

15.1 Benefits

  1. Zero Downtime: Applications continue operating during migrations
  2. Zero Data Loss: Transaction-consistent migration with validation
  3. Rollback Safety: One-click rollback with multiple safety checkpoints
  4. Performance Control: Throttling prevents overwhelming production systems
  5. Multi-Model Support: Works across relational, document, graph, and time-series models
  6. Distributed Ready: Designed for multi-region, multi-tenant deployments

15.2 Competitive Advantage

  • vs gh-ost: Native distributed support, multi-model, AI optimization
  • vs pt-online-schema-change: Better rollback, shadow traffic validation
  • vs AWS DMS: Zero-downtime for schema changes (not just data migration)
  • vs Liquibase/Flyway: Online execution, not just versioning

15.3 Production Readiness

This architecture is designed for Phase 2 implementation with clear deliverables:

  • Month 1: Core migration engine (ghost table, triggers, copy)
  • Month 2: Validation framework, rollback mechanisms
  • Month 3: Performance optimization, testing, documentation

  • Estimated Development: 3 months, 2 senior engineers
  • Investment: $450K
  • ROI: Essential for enterprise adoption, enables $50M+ ARR


Appendix A: Core Implementation Skeleton

heliosdb-migration/src/lib.rs
pub struct MigrationEngine {
    state_store: Arc<MigrationStateStore>,
    ghost_table_manager: Arc<GhostTableManager>,
    copy_engine: Arc<CopyEngine>,
    trigger_manager: Arc<TriggerManager>,
    validator: Arc<ValidatorEngine>,
    orchestrator: Arc<MigrationOrchestrator>,
}

impl MigrationEngine {
    pub async fn start(&self, migration: Migration) -> Result<MigrationId> {
        // 1. Validate migration
        self.validate(&migration).await?;
        // 2. Create migration record
        let migration_id = self.state_store.create(migration).await?;
        // 3. Start orchestration (clone the ID so it can still be returned)
        self.orchestrator.execute(migration_id.clone()).await?;
        Ok(migration_id)
    }

    pub async fn get_progress(&self, migration_id: &MigrationId) -> Result<Progress> {
        self.state_store.get_progress(migration_id).await
    }

    pub async fn rollback(&self, migration_id: &MigrationId) -> Result<()> {
        self.orchestrator.rollback(migration_id).await
    }
}

pub struct MigrationOrchestrator {
    state_machine: Arc<StateMachine>,
}

impl MigrationOrchestrator {
    pub async fn execute(&self, migration_id: MigrationId) -> Result<()> {
        loop {
            let state = self.state_machine.current_state(&migration_id).await?;
            match state {
                State::Planning => self.plan(&migration_id).await?,
                State::Preparing => self.prepare(&migration_id).await?,
                State::Copying => self.copy(&migration_id).await?,
                State::Synchronizing => self.synchronize(&migration_id).await?,
                State::Validating => self.validate(&migration_id).await?,
                State::ReadyForCutover => break,
                State::Failed => return Err(Error::MigrationFailed),
                _ => return Err(Error::InvalidState),
            }
        }
        // Wait for manual cutover or auto-cutover
        Ok(())
    }
}

End of Document

Next Steps:

  1. Review with engineering team
  2. Prototype core components (ghost table, triggers)
  3. Define detailed API specifications
  4. Create implementation plan for Phase 2