Zero-Downtime Schema Migration Architecture
Document Version: 1.0
Created: November 9, 2025
Status: Phase 2 Architecture Design
Author: System Architecture Designer
Executive Summary
This document defines the complete architecture for HeliosDB’s zero-downtime schema migration system, enabling online DDL operations during live traffic without table locks or application downtime. The system is inspired by proven tools like GitHub’s gh-ost and Percona’s pt-online-schema-change, but extends them with distributed capabilities, AI-driven optimization, and native integration with HeliosDB’s multi-model architecture.
Key Capabilities
- Online DDL: ADD/DROP/MODIFY COLUMN, ADD/DROP INDEX without blocking operations
- Ghost Table Approach: Shadow table creation with trigger-based replication
- Atomic Cutover: Millisecond switchover with automatic rollback
- Zero Data Loss: Transaction-consistent migration with validation
- Performance Monitoring: Real-time overhead tracking and throttling
- Multi-Model Support: Works across document, graph, time-series, and relational models
1. System Architecture
1.1 High-Level Architecture
```
Application Layer
  SQL Client ── GraphQL API ── REST API
                    │
                    ▼
  Migration Coordinator (Router)
    • Route to original or ghost table
    • Handle dual-write during migration
                    │
                    ▼
Zero-Downtime Migration Engine
  Migration Orchestrator
    • State machine management
    • Phase transitions
    • Rollback coordination
  Ghost Table Manager · Copy Engine · Trigger Manager · Validator Engine
  Migration State Store
    • Migration metadata (state, progress, metrics)
    • Cutover checkpoints
    • Rollback snapshots
                    │
                    ▼
Storage Layer
  Original Table ◄── Trigger Layer (Dual Write) ──► Ghost Table (_new_xxx)
    • INSERT/UPDATE/DELETE capture
    • Async replication to ghost
  HeliosDB Storage Engine (LSM Tree, WAL, MVCC, Transaction Manager)
```
1.2 Component Breakdown
1.2.1 Migration Coordinator
- Purpose: Routes queries to appropriate table during migration
- Responsibilities:
- Query routing (original vs ghost table)
- Dual-write coordination
- Read/write splitting during cutover
- Performance monitoring and throttling
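The coordinator's routing rule can be sketched as a small decision function. This is a hypothetical sketch; `Route`, `Phase`, and `route` are illustrative names, not HeliosDB APIs:

```rust
/// Which physical table a statement should touch while a migration is live.
#[derive(Debug, PartialEq)]
enum Route {
    OriginalOnly, // reads (and all traffic when no migration is running)
    DualWrite,    // write original; triggers replicate to the ghost
    GhostOnly,    // after cutover the ghost has become the table
}

#[derive(Clone, Copy, PartialEq)]
enum Phase { Idle, Copying, Synchronizing, CutoverDone }

/// Route a statement based on the migration phase and statement kind.
fn route(phase: Phase, is_write: bool) -> Route {
    match phase {
        Phase::Idle => Route::OriginalOnly,
        // During copy/sync, reads stay on the original; writes hit the
        // original and are captured by triggers for async replication.
        Phase::Copying | Phase::Synchronizing => {
            if is_write { Route::DualWrite } else { Route::OriginalOnly }
        }
        Phase::CutoverDone => Route::GhostOnly,
    }
}
```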
1.2.2 Migration Orchestrator
- Purpose: Manages migration lifecycle and state transitions
- Responsibilities:
- State machine execution
- Phase orchestration (Planning → Executing → Validating → Cutover)
- Error handling and rollback coordination
- Progress tracking and reporting
1.2.3 Ghost Table Manager
- Purpose: Creates and manages shadow tables
- Responsibilities:
- Ghost table creation with new schema
- Naming convention: `_helios_ghost_{table_name}_{timestamp}`
- Cleanup after successful cutover
- Preservation for rollback
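Under the stated convention, ghost and backup table names can be derived mechanically. A minimal sketch; the `YYYYMMDD` timestamp format is an assumption based on the examples later in this document:

```rust
/// Ghost-table name per the documented convention:
/// _helios_ghost_{table_name}_{timestamp}.
fn ghost_table_name(table: &str, yyyymmdd: &str) -> String {
    format!("_helios_ghost_{}_{}", table, yyyymmdd)
}

/// Backup name used for the original table after cutover.
fn old_table_name(table: &str, yyyymmdd: &str) -> String {
    format!("_helios_old_{}_{}", table, yyyymmdd)
}
```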
1.2.4 Copy Engine
- Purpose: Efficiently copy data from original to ghost table
- Responsibilities:
- Chunked, resumable bulk copy
- Throttling to avoid overwhelming storage
- Progress tracking (rows copied, estimated time remaining)
- Delta detection for incremental catch-up
1.2.5 Trigger Manager
- Purpose: Synchronize ongoing changes during migration
- Responsibilities:
- Install INSERT/UPDATE/DELETE triggers on original table
- Capture changes to changelog table
- Async replication to ghost table
- Conflict resolution (last-write-wins)
1.2.6 Validator Engine
- Purpose: Ensure data consistency between original and ghost
- Responsibilities:
- Row count validation
- Checksum comparison (chunk-wise)
- Sample-based deep comparison
- Index validation
1.2.7 Migration State Store
- Purpose: Persistent storage for migration metadata
- Responsibilities:
- Migration state persistence
- Progress checkpoints
- Rollback snapshots
- Audit trail
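As a sketch of what the state store might persist per migration (field names are illustrative, not the actual catalog schema):

```rust
/// One row of migration metadata as the state store might persist it.
#[derive(Debug, Clone)]
struct MigrationRecord {
    id: String,
    table: String,
    state: String,           // e.g. "COPYING"
    rows_copied: u64,
    rows_total: u64,
    last_checkpoint_lsn: u64, // resume point after a crash
}

impl MigrationRecord {
    /// Percent complete, guarding against empty tables.
    fn percent_complete(&self) -> f64 {
        if self.rows_total == 0 {
            100.0
        } else {
            self.rows_copied as f64 * 100.0 / self.rows_total as f64
        }
    }
}
```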
2. Migration Lifecycle State Machine
2.1 State Diagram
```
        ┌──────────────┐
        │   CREATED    │
        └──────┬───────┘
               │ start_migration()
               ▼
        ┌──────────────┐
        │   PLANNING   │──────────► FAILED (validation error)
        └──────┬───────┘
               │ validate()
               ▼
        ┌──────────────┐
        │  PREPARING   │
        │ (ghost table)│
        └──────┬───────┘
               │ create_ghost_table()
               ▼
        ┌──────────────┐   pause_migration()
        │   COPYING    │◄───────────────────► PAUSED
        │ (bulk data)  │   resume_migration()
        └──────┬───────┘
               │ bulk_copy_complete()
               ▼
        ┌──────────────┐
        │SYNCHRONIZING │
        │ (incremental)│
        └──────┬───────┘
               │ catch_up_complete()
               ▼
        ┌──────────────┐
        │  VALIDATING  │──────────► FAILED (validation failed)
        └──────┬───────┘
               │ validation_passed()
               ▼
        ┌──────────────┐
        │  READY FOR   │
        │   CUTOVER    │
        └──────┬───────┘
               │ cutover()
               ▼
        ┌──────────────┐
        │   CUTOVER    │
        │(in progress) │
        └──────┬───────┘
               │ cutover_complete()
               ▼
        ┌──────────────┐
        │  COMPLETED   │
        └──────────────┘

rollback() is reachable from FAILED and from any in-flight state:

        ┌──────────────┐
        │ ROLLING BACK │
        └──────┬───────┘
               │ rollback_complete()
               ▼
        ┌──────────────┐
        │ ROLLED BACK  │
        └──────────────┘
```
2.2 State Transitions
| Current State | Event | Next State | Reversible |
|---|---|---|---|
| CREATED | start_migration() | PLANNING | Yes |
| PLANNING | validate() | PREPARING | Yes |
| PREPARING | create_ghost_table() | COPYING | Yes |
| COPYING | bulk_copy_complete() | SYNCHRONIZING | ❌ No* |
| COPYING | pause_migration() | PAUSED | Yes |
| PAUSED | resume_migration() | COPYING | Yes |
| SYNCHRONIZING | catch_up_complete() | VALIDATING | ❌ No* |
| VALIDATING | validation_passed() | READY FOR CUTOVER | Yes |
| READY FOR CUTOVER | cutover() | CUTOVER | Yes (for 5 min) |
| CUTOVER | cutover_complete() | COMPLETED | ❌ No |
| ANY | rollback() | ROLLING BACK | - |
| ROLLING BACK | rollback_complete() | ROLLED BACK | ❌ No |
*Can rollback but will lose incremental copy progress
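The transition table maps directly onto a pure function. A sketch with events as strings for brevity; the names are illustrative:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Created, Planning, Preparing, Copying, Paused, Synchronizing,
    Validating, ReadyForCutover, Cutover, Completed,
    RollingBack, RolledBack, Failed,
}

/// Legal transitions from the table above; None = illegal event.
fn next_state(current: State, event: &str) -> Option<State> {
    use State::*;
    match (current, event) {
        (Created, "start_migration") => Some(Planning),
        (Planning, "validate") => Some(Preparing),
        (Preparing, "create_ghost_table") => Some(Copying),
        (Copying, "bulk_copy_complete") => Some(Synchronizing),
        (Copying, "pause_migration") => Some(Paused),
        (Paused, "resume_migration") => Some(Copying),
        (Synchronizing, "catch_up_complete") => Some(Validating),
        (Validating, "validation_passed") => Some(ReadyForCutover),
        (ReadyForCutover, "cutover") => Some(Cutover),
        (Cutover, "cutover_complete") => Some(Completed),
        (RollingBack, "rollback_complete") => Some(RolledBack),
        // Terminal states cannot roll back; everything else can.
        (Completed, "rollback") | (RolledBack, "rollback") => None,
        (_, "rollback") => Some(RollingBack),
        _ => None,
    }
}
```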
2.3 Rollback Points
The system supports rollback at different stages with varying complexity:
1. CREATED → PLANNING → PREPARING: Instant rollback
   - No data copied, no triggers installed
   - Simply drop the ghost table (if created)
2. COPYING → SYNCHRONIZING: Fast rollback
   - Drop triggers
   - Drop ghost table
   - Update migration state
   - Estimated time: <10 seconds
3. VALIDATING → READY FOR CUTOVER: Fast rollback
   - Same as above
   - Validation data already discarded
4. CUTOVER (first 5 minutes): Emergency rollback
   - Requires table-rename swap back
   - Reapply missed writes from changelog
   - Estimated time: <30 seconds
   - Hard limit: 5 minutes; after that, rollback is NOT safe
5. COMPLETED: No rollback
   - Original table dropped
   - Must create a new migration to revert the schema
3. Ghost Table Approach
3.1 Ghost Table Lifecycle
The ghost table approach is inspired by GitHub’s gh-ost but adapted for HeliosDB’s distributed architecture:
Step 1: Create Ghost Table

```
Original Table: users
┌─────────┬───────┬──────────┐
│ id (PK) │ name  │ email    │
├─────────┼───────┼──────────┤
│ 1       │ Alice │ a@ex.com │
│ 2       │ Bob   │ b@ex.com │
└─────────┴───────┴──────────┘
        │ CREATE GHOST TABLE
        ▼
Ghost Table: _helios_ghost_users_20251109
┌─────────┬───────┬──────────┬───────┐
│ id (PK) │ name  │ email    │ phone │  ← new column
├─────────┼───────┼──────────┼───────┤
│ (empty) │       │          │       │
└─────────┴───────┴──────────┴───────┘
```
Step 2: Install Triggers (Dual Write)

```
Original Table: users
  Triggers installed:
    • INSERT → changelog
    • UPDATE → changelog
    • DELETE → changelog
        │ async replication
        ▼
Changelog Table
┌────────┬─────┬─────────────────┐
│ lsn    │ op  │ data            │
├────────┼─────┼─────────────────┤
│ 100001 │ INS │ {id:3,name:...} │
│ 100002 │ UPD │ {id:1,email:..} │
└────────┴─────┴─────────────────┘
```
Step 3: Bulk Copy (Chunked)

```
Original Table: users (100M rows)
        │ chunked copy (10K rows/chunk)
        │ throttled (max 50% I/O)
        ▼
Ghost Table: _helios_ghost_users_20251109
┌─────────┬───────┬──────────┬───────┐
│ id (PK) │ name  │ email    │ phone │
├─────────┼───────┼──────────┼───────┤
│ 1       │ Alice │ a@ex.com │ NULL  │
│ 2       │ Bob   │ b@ex.com │ NULL  │
│ ...     │ ...   │ ...      │ NULL  │
│ (99.9% complete)                   │
└─────────┴───────┴──────────┴───────┘
```
Step 4: Catch-up (Apply Changelog)

```
Changelog Table — apply to the ghost table in LSN order
        │ apply
        ▼
Ghost Table: _helios_ghost_users_20251109
┌─────────┬─────────┬──────────┬───────┐
│ id (PK) │ name    │ email    │ phone │
├─────────┼─────────┼──────────┼───────┤
│ 1       │ Alice   │ new@...  │ NULL  │  ← updated
│ 2       │ Bob     │ b@ex.com │ NULL  │
│ 3       │ Charlie │ c@ex.com │ NULL  │  ← inserted
│ (100% in sync)                       │
└─────────┴─────────┴──────────┴───────┘
```
Step 5: Atomic Cutover (Rename Swap)

```sql
BEGIN TRANSACTION;

-- Lock both tables briefly (milliseconds)
LOCK TABLE users, _helios_ghost_users_20251109;

-- Apply final changelog entries
-- (should be <100 rows accumulated during the lock)

-- Atomic rename
ALTER TABLE users RENAME TO _helios_old_users_20251109;
ALTER TABLE _helios_ghost_users_20251109 RENAME TO users;

-- Update metadata catalog
UPDATE helios_migration_state SET status = 'COMPLETED';

COMMIT;
```
Post-Cutover:

```
New Table: users (with new schema)
┌─────────┬─────────┬──────────┬───────┐
│ id (PK) │ name    │ email    │ phone │
├─────────┼─────────┼──────────┼───────┤
│ 1       │ Alice   │ new@...  │ NULL  │
│ 2       │ Bob     │ b@ex.com │ NULL  │
│ 3       │ Charlie │ c@ex.com │ NULL  │
└─────────┴─────────┴──────────┴───────┘

Backup: _helios_old_users_20251109
(kept for 24 hours for emergency rollback)
```
3.2 Trigger-Based Synchronization
HeliosDB uses a changelog table approach to capture changes:
```sql
-- Changelog table structure
CREATE TABLE _helios_changelog_{table_name}_{timestamp} (
    lsn       BIGINT PRIMARY KEY,    -- Log Sequence Number
    operation VARCHAR(10) NOT NULL,  -- INSERT, UPDATE, DELETE
    pk_value  TEXT NOT NULL,         -- Primary key of affected row
    data      JSONB,                 -- Full row data (for INSERT/UPDATE)
    timestamp TIMESTAMP NOT NULL,    -- When the change occurred
    applied   BOOLEAN DEFAULT false  -- Has been applied to the ghost
);

-- Trigger on original table (pseudo-code)
CREATE TRIGGER capture_changes
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH ROW
BEGIN
    INSERT INTO _helios_changelog_users_20251109 (
        lsn, operation, pk_value, data, timestamp
    ) VALUES (
        next_lsn(),
        operation_type,
        NEW.id,            -- or OLD.id for DELETE
        row_to_json(NEW),
        NOW()
    );
END;
```

Replication Worker (async process):
- Polls the changelog table for `applied = false` entries
- Applies changes to the ghost table in LSN order
- Marks entries as `applied = true`
- Handles conflicts (last-write-wins based on timestamp)
- Throttles to avoid overwhelming storage
4. Migration Operations Support
4.1 Supported DDL Operations
| Operation | Complexity | Downtime | Data Copy | Notes |
|---|---|---|---|---|
| ADD COLUMN (nullable) | Low | 0ms | No | Metadata-only change |
| ADD COLUMN (NOT NULL) | Medium | ~ms | Yes | Requires default or backfill |
| DROP COLUMN | Low | ~ms | No | Metadata-only, data ignored |
| MODIFY COLUMN (type) | High | ~ms | Yes | Full table rewrite if incompatible |
| RENAME COLUMN | Low | 0ms | No | Metadata-only change |
| ADD INDEX | Medium | 0ms | No | Built concurrently on ghost |
| DROP INDEX | Low | 0ms | No | Metadata-only change |
| ADD PRIMARY KEY | High | ~ms | Yes | Full table rewrite |
| DROP PRIMARY KEY | High | ~ms | Yes | Full table rewrite |
| ADD FOREIGN KEY | Medium | ~ms | No | Validation required |
| DROP FOREIGN KEY | Low | 0ms | No | Metadata-only change |
| ADD UNIQUE CONSTRAINT | Medium | ~ms | No | Validation + index build |
| DROP UNIQUE CONSTRAINT | Low | 0ms | No | Metadata-only change |
4.2 Operation Implementation Details
4.2.1 ADD COLUMN (nullable)
Scenario: Add optional phone column to users table.
Strategy: Fast Path - Metadata Only
```sql
-- Original command
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- HeliosDB internal execution
BEGIN TRANSACTION;

-- Update schema metadata
UPDATE helios_catalog.columns
SET schema_version = schema_version + 1
WHERE table_name = 'users';

INSERT INTO helios_catalog.columns (
    table_name, column_name, data_type, nullable, default_value
) VALUES (
    'users', 'phone', 'VARCHAR(20)', true, NULL
);

-- No data migration needed
-- Existing rows will return NULL for the phone column
COMMIT;
```

Overhead: <10ms (metadata write only)
4.2.2 ADD COLUMN (NOT NULL with default)
Scenario: Add mandatory created_at column with default value.
Strategy: Ghost Table Migration
```sql
-- Original command
ALTER TABLE users ADD COLUMN created_at TIMESTAMP NOT NULL DEFAULT NOW();

-- HeliosDB internal execution
-- Step 1: Create ghost table with the new column
CREATE TABLE _helios_ghost_users_20251109 AS
SELECT *, NOW() AS created_at FROM users WHERE 1=0;

-- Step 2: Install triggers for dual write
-- Step 3: Bulk copy with the default value
INSERT INTO _helios_ghost_users_20251109
SELECT *, NOW() AS created_at
FROM users
ORDER BY id
LIMIT 10000;  -- chunked

-- Step 4: Catch up from the changelog
-- Step 5: Atomic cutover
```

Overhead: proportional to table size, ~1-5 minutes for 100M rows
4.2.3 MODIFY COLUMN (type change)
Scenario: Change user_id from INT to BIGINT.
Strategy: Ghost Table with Type Casting
```sql
-- Original command
ALTER TABLE users MODIFY COLUMN user_id BIGINT;

-- HeliosDB internal execution
-- Create ghost table with the new type
CREATE TABLE _helios_ghost_users_20251109 (
    user_id BIGINT,       -- changed from INT
    name    VARCHAR(255),
    email   VARCHAR(255)
);

-- Copy with casting
INSERT INTO _helios_ghost_users_20251109
SELECT CAST(user_id AS BIGINT), name, email
FROM users
LIMIT 10000;  -- chunked
```

Type Compatibility Matrix:
| From → To | Safe? | Strategy |
|---|---|---|
| INT → BIGINT | Yes | Direct cast |
| BIGINT → INT | ⚠ Warning | Cast with overflow check |
| VARCHAR(N) → VARCHAR(M) where M>N | Yes | Direct |
| VARCHAR(N) → VARCHAR(M) where M<N | ❌ No | Error if data > M |
| TIMESTAMP → DATE | ⚠ Warning | Truncation (time lost) |
| JSON → TEXT | Yes | Stringify |
| TEXT → JSON | ⚠ Warning | Parse (may fail) |
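The matrix can be encoded as a small classifier. A sketch covering only the cases listed above; `CastSafety` and `cast_safety` are illustrative names, not HeliosDB APIs:

```rust
#[derive(Debug, PartialEq)]
enum CastSafety {
    Safe,    // direct cast
    Warning, // cast with runtime check (overflow, truncation, parse)
    Unsafe,  // rejected unless data is verified to fit
}

/// Classify a type change per the compatibility matrix above.
/// Lengths only matter for VARCHAR(N) → VARCHAR(M).
fn cast_safety(from: &str, to: &str, from_len: u32, to_len: u32) -> CastSafety {
    match (from, to) {
        ("INT", "BIGINT") | ("JSON", "TEXT") => CastSafety::Safe,
        ("BIGINT", "INT") | ("TIMESTAMP", "DATE") | ("TEXT", "JSON") => {
            CastSafety::Warning
        }
        ("VARCHAR", "VARCHAR") if to_len >= from_len => CastSafety::Safe,
        ("VARCHAR", "VARCHAR") => CastSafety::Unsafe, // M < N: data may not fit
        _ => CastSafety::Unsafe, // unlisted conversions: reject by default
    }
}
```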
4.2.4 ADD INDEX
Scenario: Add index on email column.
Strategy: Concurrent Index Build
```sql
-- Original command
CREATE INDEX idx_users_email ON users(email);

-- HeliosDB internal execution
-- Build the index on the ghost table while it is being populated.
-- No triggers needed: we are adding an index, not changing data.
CREATE INDEX CONCURRENTLY idx_users_email
ON _helios_ghost_users_20251109(email);
```

Overhead: index build time (~1-10 min for 100M rows)
4.2.5 DROP COLUMN
Scenario: Remove deprecated legacy_field column.
Strategy: Ghost Table without Column
```sql
-- Original command
ALTER TABLE users DROP COLUMN legacy_field;

-- HeliosDB internal execution
CREATE TABLE _helios_ghost_users_20251109 (
    user_id BIGINT,
    name    VARCHAR(255),
    email   VARCHAR(255)
    -- legacy_field omitted
);

-- Copy only the required columns
INSERT INTO _helios_ghost_users_20251109
SELECT user_id, name, email  -- exclude legacy_field
FROM users
LIMIT 10000;
```

Overhead: minimal (column ignored during copy)
5. Rollback Capabilities
5.1 One-Click Rollback Design
Philosophy: Rollback should be as easy as starting the migration.
Implementation:
```rust
// Simple rollback API
migration_engine.rollback(migration_id)?;
```

```sql
-- Or via SQL
ROLLBACK MIGRATION 'migration-uuid-1234';
```

5.2 Rollback Mechanisms by Phase
Phase 1-3 (PLANNING → COPYING): Drop Ghost Table
```rust
async fn rollback_early_phase(migration_id: &str) -> Result<()> {
    // 1. Drop triggers if installed
    drop_triggers(migration_id).await?;

    // 2. Drop ghost table
    let ghost_table = format!("_helios_ghost_{}", migration_id);
    execute_sql(&format!("DROP TABLE IF EXISTS {}", ghost_table)).await?;

    // 3. Drop changelog table
    let changelog_table = format!("_helios_changelog_{}", migration_id);
    execute_sql(&format!("DROP TABLE IF EXISTS {}", changelog_table)).await?;

    // 4. Update migration state
    update_migration_state(migration_id, MigrationState::RolledBack).await?;

    Ok(())
}
```

Time: <10 seconds. Risk: none (no production data affected)
Phase 4 (VALIDATING → READY): Same as Phase 1-3
Phase 5 (CUTOVER - First 5 Minutes): Emergency Rename Swap
```rust
async fn rollback_cutover(migration_id: &str) -> Result<()> {
    // Check that we are within the 5-minute window
    let cutover_time = get_cutover_timestamp(migration_id).await?;
    let elapsed = Utc::now() - cutover_time;
    if elapsed > Duration::minutes(5) {
        return Err(Error::RollbackWindowExpired);
    }

    // Begin transaction
    let txn = begin_transaction().await?;

    // 1. Lock both tables
    txn.execute("LOCK TABLE users, _helios_old_users_20251109").await?;

    // 2. Swap names back
    txn.execute("ALTER TABLE users RENAME TO _helios_temp_broken").await?;
    txn.execute("ALTER TABLE _helios_old_users_20251109 RENAME TO users").await?;
    txn.execute("DROP TABLE _helios_temp_broken").await?;

    // 3. Apply missed writes from the changelog
    let changelog = get_cutover_changelog(migration_id).await?;
    for entry in changelog {
        apply_changelog_entry(&txn, &entry).await?;
    }

    // 4. Update metadata
    txn.execute(&format!(
        "UPDATE helios_migration_state SET status = 'ROLLED_BACK' WHERE id = '{}'",
        migration_id
    )).await?;

    txn.commit().await?;

    Ok(())
}
```

Time: 10-30 seconds. Risk: low (within the 5-minute window). Data loss: none (changelog replay ensures consistency)
Phase 6 (COMPLETED): No Rollback
Once migration is marked COMPLETED and old table is dropped, rollback is NOT possible. To revert schema changes:
- Create a new migration with inverse operations
- Execute new migration to restore old schema
Recommendation: Keep _helios_old_{table} for 24 hours before dropping.
5.3 Rollback Safety Checks
```rust
pub struct RollbackSafetyChecker {
    migration_state: MigrationState,
    cutover_timestamp: Option<DateTime<Utc>>,
    old_table_exists: bool,
    changelog_exists: bool,
}

impl RollbackSafetyChecker {
    pub fn can_rollback(&self) -> Result<RollbackFeasibility> {
        match self.migration_state {
            MigrationState::Planning
            | MigrationState::Preparing
            | MigrationState::Copying => Ok(RollbackFeasibility::Safe {
                estimated_time_seconds: 10,
                data_loss_risk: DataLossRisk::None,
            }),

            MigrationState::Synchronizing
            | MigrationState::Validating
            | MigrationState::ReadyForCutover => Ok(RollbackFeasibility::Safe {
                estimated_time_seconds: 15,
                data_loss_risk: DataLossRisk::Low,
            }),

            MigrationState::Cutover => {
                if let Some(cutover_time) = self.cutover_timestamp {
                    let elapsed = Utc::now() - cutover_time;
                    if elapsed <= Duration::minutes(5) {
                        Ok(RollbackFeasibility::Risky {
                            estimated_time_seconds: 30,
                            data_loss_risk: DataLossRisk::Low,
                            warnings: vec![
                                "Within emergency rollback window".to_string(),
                                "Some queries may fail during rollback".to_string(),
                            ],
                        })
                    } else {
                        Err(Error::RollbackWindowExpired)
                    }
                } else {
                    Err(Error::InvalidState)
                }
            }

            MigrationState::Completed => {
                if self.old_table_exists {
                    Ok(RollbackFeasibility::Possible {
                        estimated_time_seconds: 300,
                        data_loss_risk: DataLossRisk::High,
                        warnings: vec![
                            "Migration completed, old table still exists".to_string(),
                            "Rollback will require a new migration".to_string(),
                        ],
                    })
                } else {
                    Err(Error::CannotRollback {
                        reason: "Old table dropped, rollback impossible".to_string(),
                    })
                }
            }

            _ => Err(Error::InvalidState),
        }
    }
}
```

6. Migration Testing Framework
6.1 Shadow Traffic Validation
Before cutover, HeliosDB can validate the ghost table using shadow traffic:
```
Application Traffic
        │
        ▼
Migration Coordinator (shadow mode enabled)
   │ primary                  │ shadow (read-only)
   ▼                          ▼
Original Table (users)     Ghost Table (_new_users)
   │                          │
   ▼                          ▼
Return result to app       Compare result, log diff
```

Shadow Traffic Mode:
- All read queries executed against BOTH tables
- Results compared for consistency
- Differences logged for analysis
- Only original table results returned to application
- Zero impact on application (async comparison)
Validation Metrics:
- Consistency Rate: % of queries with matching results
- Performance Delta: Ghost table vs original (latency, throughput)
- Data Completeness: Row count, checksum comparison
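These metrics reduce to simple arithmetic over a window of shadow comparisons. A minimal sketch; the `Comparison` record is hypothetical:

```rust
/// Outcome of one shadow-traffic comparison: did the results match,
/// and how long did each table take to answer?
struct Comparison {
    matched: bool,
    original_ms: f64,
    ghost_ms: f64,
}

/// Returns (consistency rate in %, mean latency delta in %) over a window.
/// A positive delta means the ghost table is slower than the original.
fn shadow_metrics(window: &[Comparison]) -> (f64, f64) {
    let n = window.len() as f64;
    let matches = window.iter().filter(|c| c.matched).count() as f64;
    let avg_orig: f64 = window.iter().map(|c| c.original_ms).sum::<f64>() / n;
    let avg_ghost: f64 = window.iter().map(|c| c.ghost_ms).sum::<f64>() / n;
    (matches * 100.0 / n, (avg_ghost - avg_orig) / avg_orig * 100.0)
}
```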
6.2 Automated Testing Before Cutover
```rust
pub struct MigrationTestSuite {
    migration_id: String,
    original_table: String,
    ghost_table: String,
}

impl MigrationTestSuite {
    /// Run comprehensive pre-cutover tests
    pub async fn run_all_tests(&self) -> Result<TestReport> {
        let mut report = TestReport::new();

        // Test 1: Row count validation
        report.add_result(self.test_row_count().await?);

        // Test 2: Checksum comparison
        report.add_result(self.test_checksums().await?);

        // Test 3: Sample-based deep comparison
        report.add_result(self.test_sample_comparison(1000).await?);

        // Test 4: Index validation
        report.add_result(self.test_index_consistency().await?);

        // Test 5: Constraint validation
        report.add_result(self.test_constraints().await?);

        // Test 6: Performance comparison
        report.add_result(self.test_performance().await?);

        // Test 7: Shadow traffic validation (if enabled)
        if self.is_shadow_mode_enabled() {
            report.add_result(self.test_shadow_traffic().await?);
        }

        Ok(report)
    }

    async fn test_row_count(&self) -> Result<TestResult> {
        let original_count = query_one::<i64>(&format!(
            "SELECT COUNT(*) FROM {}", self.original_table
        )).await?;

        let ghost_count = query_one::<i64>(&format!(
            "SELECT COUNT(*) FROM {}", self.ghost_table
        )).await?;

        Ok(TestResult {
            name: "Row Count Validation".to_string(),
            passed: original_count == ghost_count,
            message: format!("Original: {}, Ghost: {}", original_count, ghost_count),
        })
    }

    async fn test_checksums(&self) -> Result<TestResult> {
        // Compute checksums for 100 chunks
        let chunk_size = self.get_table_size().await? / 100;
        let mut mismatches = Vec::new();

        for i in 0..100 {
            let offset = i * chunk_size;
            let original_checksum = self
                .compute_checksum(&self.original_table, offset, chunk_size)
                .await?;
            let ghost_checksum = self
                .compute_checksum(&self.ghost_table, offset, chunk_size)
                .await?;

            if original_checksum != ghost_checksum {
                mismatches.push(i);
            }
        }

        Ok(TestResult {
            name: "Checksum Comparison".to_string(),
            passed: mismatches.is_empty(),
            message: if mismatches.is_empty() {
                "All chunks match".to_string()
            } else {
                format!("Mismatched chunks: {:?}", mismatches)
            },
        })
    }

    async fn test_sample_comparison(&self, sample_size: usize) -> Result<TestResult> {
        // Randomly sample rows and compare
        let sample_ids = self.get_random_sample_ids(sample_size).await?;
        let mut mismatches = 0;

        for id in sample_ids {
            let original_row = self.get_row(&self.original_table, id).await?;
            let ghost_row = self.get_row(&self.ghost_table, id).await?;

            if original_row != ghost_row {
                mismatches += 1;
            }
        }

        Ok(TestResult {
            name: "Sample Comparison".to_string(),
            passed: mismatches == 0,
            message: format!("Sampled {} rows, {} mismatches", sample_size, mismatches),
        })
    }

    async fn test_performance(&self) -> Result<TestResult> {
        // Run representative queries on both tables
        let queries = vec![
            "SELECT * FROM {} WHERE id = 12345",
            "SELECT COUNT(*) FROM {} WHERE created_at > NOW() - INTERVAL '1 day'",
            "SELECT AVG(value) FROM {} GROUP BY category",
        ];

        let mut original_times = Vec::new();
        let mut ghost_times = Vec::new();

        for query_template in queries {
            let original_query = query_template.replace("{}", &self.original_table);
            let ghost_query = query_template.replace("{}", &self.ghost_table);

            original_times.push(measure_query_time(&original_query).await?);
            ghost_times.push(measure_query_time(&ghost_query).await?);
        }

        let avg_original = original_times.iter().sum::<f64>() / original_times.len() as f64;
        let avg_ghost = ghost_times.iter().sum::<f64>() / ghost_times.len() as f64;
        let performance_delta = ((avg_ghost - avg_original) / avg_original) * 100.0;

        // Pass if the ghost table is within 20% of original performance
        let passed = performance_delta.abs() < 20.0;

        Ok(TestResult {
            name: "Performance Comparison".to_string(),
            passed,
            message: format!(
                "Original: {:.2}ms, Ghost: {:.2}ms, Delta: {:.1}%",
                avg_original, avg_ghost, performance_delta
            ),
        })
    }
}
```

6.3 Migration Dry Run
```rust
// Execute migration in dry-run mode
let migration = MigrationBuilder::new()
    .table("users")
    .add_column("phone", DataType::VarChar(20), /* nullable */ true)
    .dry_run(true)  // no actual changes
    .build();

let result = migration_engine.execute(migration).await?;

// A dry run reports:
// - Estimated duration
// - Resource requirements (disk space, I/O)
// - Potential issues
// - Rollback feasibility
```

7. API Design
7.1 SQL Syntax
7.1.1 Start Online Migration
```sql
-- Explicit online migration
ALTER TABLE users
ADD COLUMN phone VARCHAR(20) NULL
WITH (MIGRATION_MODE = 'ONLINE');

-- Shorthand (online by default)
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Force offline migration (faster for large tables when downtime is tolerable)
ALTER TABLE users
ADD COLUMN phone VARCHAR(20)
WITH (MIGRATION_MODE = 'OFFLINE');
```

7.1.2 Migration Options
```sql
ALTER TABLE users MODIFY COLUMN user_id BIGINT
WITH (
    MIGRATION_MODE        = 'ONLINE',
    CHUNK_SIZE            = 10000,  -- rows per chunk
    THROTTLE_IO           = 50,     -- max % of I/O capacity
    ENABLE_SHADOW_TRAFFIC = true,   -- validate with shadow traffic
    ROLLBACK_RETENTION    = '24h'   -- keep old table for 24 hours
);
```

7.1.3 Monitor Migration
```sql
-- Show all active migrations
SHOW MIGRATIONS;

-- Show specific migration details
SHOW MIGRATION 'migration-uuid-1234';

-- Output:
-- Migration ID: migration-uuid-1234
-- Table: users
-- Operation: ADD COLUMN phone VARCHAR(20)
-- Status: COPYING
-- Progress: 67% (67M/100M rows)
-- Elapsed Time: 00:15:32
-- Estimated Remaining: 00:07:45
-- Current Phase: Bulk Copy
-- Rollback Available: Yes
```

7.1.4 Control Migration
```sql
-- Pause migration (useful for maintenance windows)
PAUSE MIGRATION 'migration-uuid-1234';

-- Resume paused migration
RESUME MIGRATION 'migration-uuid-1234';

-- Rollback migration
ROLLBACK MIGRATION 'migration-uuid-1234';

-- Force cutover (skip validation)
CUTOVER MIGRATION 'migration-uuid-1234' WITH (FORCE = true);
```

7.2 Programmatic API (Rust)
```rust
use heliosdb::migration::{MigrationEngine, MigrationBuilder, MigrationMode};

// Create the migration engine
let migration_engine = MigrationEngine::new(config);

// Build a migration
let migration = MigrationBuilder::new()
    .table("users")
    .add_column(ColumnDefinition {
        name: "phone".to_string(),
        data_type: DataType::VarChar(20),
        nullable: true,
        default: None,
    })
    .mode(MigrationMode::Online)
    .chunk_size(10_000)
    .throttle_io_percent(50)
    .enable_shadow_traffic(true)
    .build()?;

// Execute the migration asynchronously
let migration_id = migration_engine.start(migration).await?;

// Monitor progress
let progress = migration_engine.get_progress(&migration_id).await?;
println!("Progress: {}%", progress.percent_complete);

// Pause if needed
migration_engine.pause(&migration_id).await?;

// Resume
migration_engine.resume(&migration_id).await?;

// Rollback
migration_engine.rollback(&migration_id).await?;
```

7.3 REST API
```bash
# Start a migration
curl -X POST http://localhost:8080/api/migrations \
  -H "Content-Type: application/json" \
  -d '{
    "table": "users",
    "operation": {
      "type": "ADD_COLUMN",
      "column": { "name": "phone", "type": "VARCHAR(20)", "nullable": true }
    },
    "mode": "ONLINE",
    "options": {
      "chunk_size": 10000,
      "throttle_io_percent": 50,
      "enable_shadow_traffic": true
    }
  }'

# Response:
# {
#   "migration_id": "migration-uuid-1234",
#   "status": "PLANNING",
#   "estimated_duration_seconds": 450
# }

# Get migration status
curl http://localhost:8080/api/migrations/migration-uuid-1234

# Pause migration
curl -X POST http://localhost:8080/api/migrations/migration-uuid-1234/pause

# Resume migration
curl -X POST http://localhost:8080/api/migrations/migration-uuid-1234/resume

# Rollback migration
curl -X POST http://localhost:8080/api/migrations/migration-uuid-1234/rollback
```

8. Performance Analysis
8.1 Overhead During Migration
| Phase | I/O Overhead | CPU Overhead | Latency Impact | Notes |
|---|---|---|---|---|
| Planning | 0% | <1% | 0ms | Metadata-only |
| Ghost Table Creation | ~5% | ~2% | 0ms | One-time |
| Trigger Installation | 0% | 0% | +0.1ms/write | Small overhead |
| Bulk Copy | 10-50%* | 5-15% | 0ms | Configurable throttle |
| Synchronization | 5-10% | 2-5% | +0.2ms/write | Changelog replication |
| Validation | 20-40% | 10-20% | 0ms | Background process |
| Cutover | 0% | <1% | 5-10ms | Brief lock |
*Throttle configurable via THROTTLE_IO parameter
8.2 Migration Duration Estimates
Assumptions:
- Table size: 100M rows
- Row size: 1KB average
- Total data: 100GB
- I/O capacity: 100MB/s
- Throttle: 50% (50MB/s allocated to migration)
| Phase | Duration | Calculation |
|---|---|---|
| Planning | <1 second | Metadata operations |
| Ghost Table Creation | ~1 second | DDL operation |
| Trigger Installation | <1 second | DDL operation |
| Bulk Copy | ~33 minutes | 100GB ÷ 50MB/s = 2000s ≈ 33 min |
| Synchronization | ~5 minutes | Catch up on incremental changes |
| Validation | ~10 minutes | Checksum + sampling |
| Cutover | <10 seconds | Rename + apply final changelog |
| Total | ~50 minutes | End-to-end |
Scaling:
- 10M rows: ~5 minutes
- 1B rows: ~8 hours
- 10B rows: ~3 days (consider partitioned migration)
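The bulk-copy row of the table above is just data volume divided by the throttled I/O budget. A sketch reproducing that arithmetic, using the same decimal units as the assumptions (1 KB = 1000 bytes):

```rust
/// Estimated bulk-copy duration in minutes:
/// total data (MB) / (I/O capacity × throttle fraction) / 60.
fn bulk_copy_minutes(rows: u64, row_bytes: u64, io_mb_per_s: f64, throttle_pct: f64) -> f64 {
    let total_mb = (rows * row_bytes) as f64 / 1_000_000.0;
    let budget_mb_per_s = io_mb_per_s * throttle_pct / 100.0;
    total_mb / budget_mb_per_s / 60.0
}
```

With the document's assumptions (100M rows × 1 KB, 100 MB/s capacity, 50% throttle), this reproduces the 2000 s ≈ 33 min figure.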
8.3 Optimization Techniques
8.3.1 Adaptive Throttling
```rust
pub struct AdaptiveThrottle {
    target_io_percent: f64,
    current_io_usage: f64,
    adjustment_factor: f64,
}

impl AdaptiveThrottle {
    pub fn adjust_chunk_size(&mut self, current_chunk_size: usize) -> usize {
        // Measure current I/O usage
        self.current_io_usage = self.measure_io_usage();

        // Adjust chunk size based on I/O headroom
        let io_headroom = self.target_io_percent - self.current_io_usage;

        if io_headroom > 10.0 {
            // Plenty of headroom: increase chunk size
            (current_chunk_size as f64 * 1.2) as usize
        } else if io_headroom < -5.0 {
            // Over budget: decrease chunk size
            (current_chunk_size as f64 * 0.8) as usize
        } else {
            // Within acceptable range
            current_chunk_size
        }
    }
}
```

8.3.2 Parallel Chunk Processing
For very large tables, process multiple chunks in parallel:
```rust
pub async fn parallel_copy(
    &self,
    table: &str,
    ghost_table: &str,
    total_rows: u64,
    chunk_size: usize,
    parallelism: usize,
) -> Result<()> {
    let num_chunks = (total_rows as usize + chunk_size - 1) / chunk_size;

    // Create a shared work queue of chunk IDs.
    // (tokio's mpsc receiver cannot be cloned, so wrap it in a mutex.)
    let (tx, rx) = mpsc::channel(num_chunks);
    let rx = Arc::new(Mutex::new(rx));

    // Send all chunk IDs
    for chunk_id in 0..num_chunks {
        tx.send(chunk_id).await?;
    }
    drop(tx);

    // Spawn worker tasks
    let mut handles = Vec::new();
    for _worker_id in 0..parallelism {
        let rx = Arc::clone(&rx);
        let table = table.to_string();
        let ghost_table = ghost_table.to_string();

        let handle = tokio::spawn(async move {
            loop {
                // recv() returns None once the queue is drained
                let chunk_id = match rx.lock().await.recv().await {
                    Some(id) => id,
                    None => break,
                };
                let offset = chunk_id * chunk_size;
                copy_chunk(&table, &ghost_table, offset, chunk_size).await?;
            }
            Ok::<(), Error>(())
        });

        handles.push(handle);
    }

    // Wait for all workers
    for handle in handles {
        handle.await??;
    }

    Ok(())
}
```

Performance: 2-4x faster for large tables (I/O bound)
8.3.3 Compression During Copy
```sql
// Compress data during copy to reduce I/O
INSERT INTO _helios_ghost_users_20251109
SELECT id, name, COMPRESS(large_text_field) as large_text_field
FROM users
WHERE id BETWEEN ? AND ?;
```

Benefit: 30-50% faster for tables with compressible data
9. Failure Modes & Recovery
9.1 Failure Scenarios
| Failure | Phase | Impact | Recovery |
|---|---|---|---|
| Process Crash | Any | Migration paused | Resume from last checkpoint |
| Disk Full | Copying | Migration paused | Free disk space, resume |
| Network Partition | Copying | Partial data loss | Resync from changelog |
| Validation Failure | Validating | Migration failed | Rollback, investigate |
| Cutover Timeout | Cutover | Partial cutover | Emergency rollback |
| Trigger Error | Synchronizing | Data inconsistency | Pause, resync, resume |
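The recovery column of the table above maps cleanly onto a table-driven policy function. A sketch with illustrative enum names — the engine's actual error and action types may differ:

```rust
/// Failure modes from the table above. (Illustrative enums; the real
/// engine's error types are richer.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum Failure { ProcessCrash, DiskFull, NetworkPartition, ValidationFailure, CutoverTimeout, TriggerError }

#[derive(Debug, Clone, Copy, PartialEq)]
enum Recovery { ResumeFromCheckpoint, PauseUntilSpaceFreed, ResyncFromChangelog, RollbackAndInvestigate, EmergencyRollback, PauseResyncResume }

/// Table-driven recovery policy: every failure mode has exactly one
/// prescribed recovery action, so the match is exhaustive by design.
fn recovery_for(failure: Failure) -> Recovery {
    match failure {
        Failure::ProcessCrash => Recovery::ResumeFromCheckpoint,
        Failure::DiskFull => Recovery::PauseUntilSpaceFreed,
        Failure::NetworkPartition => Recovery::ResyncFromChangelog,
        Failure::ValidationFailure => Recovery::RollbackAndInvestigate,
        Failure::CutoverTimeout => Recovery::EmergencyRollback,
        Failure::TriggerError => Recovery::PauseResyncResume,
    }
}

fn main() {
    println!("{:?}", recovery_for(Failure::CutoverTimeout));
}
```

Encoding the policy as an exhaustive match means a newly added failure mode fails compilation until a recovery action is chosen for it.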
9.2 Recovery Strategies
9.2.1 Checkpoint-Based Resume
```rust
pub struct MigrationCheckpoint {
    migration_id: String,
    last_copied_pk: i64,   // Last primary key copied
    last_applied_lsn: i64, // Last LSN applied from changelog
    rows_copied: u64,
    timestamp: DateTime<Utc>,
}

impl MigrationEngine {
    pub async fn resume_from_checkpoint(&self, migration_id: &str) -> Result<()> {
        // Load checkpoint
        let checkpoint = self.load_checkpoint(migration_id).await?;

        // Resume from last position
        match self.get_migration_state(migration_id).await? {
            MigrationState::Copying => {
                // Resume bulk copy from last_copied_pk
                self.resume_bulk_copy(migration_id, checkpoint.last_copied_pk).await?;
            }
            MigrationState::Synchronizing => {
                // Resume changelog application from last_applied_lsn
                self.resume_changelog_sync(migration_id, checkpoint.last_applied_lsn).await?;
            }
            _ => return Err(Error::InvalidState),
        }

        Ok(())
    }
}
```

9.2.2 Automatic Retry with Exponential Backoff
```rust
pub async fn copy_chunk_with_retry(
    &self,
    table: &str,
    ghost_table: &str,
    offset: u64,
    chunk_size: usize,
) -> Result<()> {
    let mut retries = 0;
    let max_retries = 5;

    loop {
        match self.copy_chunk(table, ghost_table, offset, chunk_size).await {
            Ok(_) => return Ok(()),
            Err(e) if retries < max_retries => {
                let backoff = Duration::from_secs(2_u64.pow(retries));
                warn!("Chunk copy failed, retrying in {:?}: {}", backoff, e);
                tokio::time::sleep(backoff).await;
                retries += 1;
            }
            Err(e) => {
                error!("Chunk copy failed after {} retries: {}", retries, e);
                return Err(e);
            }
        }
    }
}
```

9.2.3 Health Checks
```rust
pub struct MigrationHealthChecker {
    migration_id: String,
}

impl MigrationHealthChecker {
    pub async fn run_health_check(&self) -> HealthStatus {
        let mut issues = Vec::new();

        // Check disk space
        if self.available_disk_space().await < self.estimated_space_needed().await {
            issues.push(HealthIssue::InsufficientDisk);
        }

        // Check trigger status
        if !self.triggers_active().await {
            issues.push(HealthIssue::TriggersInactive);
        }

        // Check changelog lag
        let lag = self.changelog_lag().await;
        if lag > Duration::from_secs(300) {
            issues.push(HealthIssue::ChangelogLagging);
        }

        // Check data consistency
        if let Some(drift) = self.detect_data_drift().await {
            issues.push(HealthIssue::DataDrift(drift));
        }

        if issues.is_empty() {
            HealthStatus::Healthy
        } else {
            HealthStatus::Unhealthy(issues)
        }
    }
}
```

10. Integration with HeliosDB
10.1 Schema Management Integration
```rust
impl SchemaService {
    /// Apply schema change with zero-downtime migration
    pub async fn apply_schema_change(
        &self,
        table: &str,
        change: SchemaChange,
        options: MigrationOptions,
    ) -> Result<MigrationId> {
        // Check if online migration is needed
        let requires_migration = self.requires_migration(&change)?;

        if requires_migration {
            // Use zero-downtime migration
            let migration_engine = self.get_migration_engine();
            let migration = MigrationBuilder::new()
                .table(table)
                .change(change)
                .options(options)
                .build()?;

            migration_engine.start(migration).await
        } else {
            // Fast path: metadata-only change
            self.apply_metadata_change(table, change).await
        }
    }
}
```

10.2 Transaction Manager Integration
```rust
impl TransactionManager {
    /// Route query to the appropriate table during migration
    pub async fn execute_query(&self, query: &Query) -> Result<QueryResult> {
        let table = query.table();

        // Check if a migration is in progress
        if let Some(migration) = self.get_active_migration(table).await? {
            match migration.state {
                MigrationState::Copying
                | MigrationState::Synchronizing
                | MigrationState::Validating => {
                    // Route to the original table
                    self.execute_on_table(table, query).await
                }
                MigrationState::Cutover => {
                    // Brief lock during cutover: wait for it to complete
                    self.wait_for_cutover(&migration).await?;
                    self.execute_on_table(table, query).await
                }
                _ => self.execute_on_table(table, query).await,
            }
        } else {
            // No migration, execute normally
            self.execute_on_table(table, query).await
        }
    }
}
```

10.3 Storage Engine Integration
```rust
impl LsmStorage {
    /// Support for ghost table operations
    pub async fn create_ghost_table(
        &self,
        original_table: &str,
        new_schema: &Schema,
    ) -> Result<String> {
        let ghost_table_name = format!(
            "_helios_ghost_{}_{}",
            original_table,
            Utc::now().timestamp()
        );

        // Create a new LSM tree with the new schema
        self.create_table(&ghost_table_name, new_schema).await?;

        Ok(ghost_table_name)
    }

    /// Atomic table rename for cutover
    pub async fn atomic_rename(&self, old_name: &str, new_name: &str) -> Result<()> {
        // Use storage-level atomic rename; this is a metadata-only operation
        self.metadata_store.rename_table(old_name, new_name).await
    }
}
```

10.4 Multi-Model Support
The migration system works across all HeliosDB data models:
10.4.1 Document Model
```sql
// Migrate document collection schema
ALTER COLLECTION users ADD FIELD phone STRING NULL;

// Internally uses the ghost collection approach:
// - Create ghost collection with new schema
// - Copy documents with new field (set to null)
// - Sync incremental changes
// - Atomic cutover
```

10.4.2 Graph Model
```sql
// Migrate vertex schema
ALTER VERTEX User ADD PROPERTY phone STRING NULL;

// Ghost vertex table approach:
// - Create ghost vertex table
// - Copy vertices with new property
// - Preserve edges (referenced by ID)
// - Atomic cutover
```

10.4.3 Time-Series Model
```sql
// Migrate time-series metric schema
ALTER METRIC cpu_usage ADD TAG hostname STRING;

// Special handling for time-series:
// - Only applies to new data (forward-only)
// - Historical data remains unchanged
// - No ghost table needed
```

11. Edge Cases & Solutions
11.1 Large Tables (>1TB)
Challenge: Single migration takes too long (days)
Solution: Partitioned Migration
```rust
// Migrate the table partition by partition
for partition in table.partitions() {
    let migration = MigrationBuilder::new()
        .table(table.name())
        .partition(partition.id())
        .change(change.clone())
        .build()?;

    migration_engine.start(migration).await?;
}

// Cut over all partitions atomically
migration_engine.cutover_all(table.name()).await?;
```

11.2 Complex Constraints
Challenge: Foreign key constraints between tables
Solution: Constraint-Aware Migration
```rust
// Detect foreign key constraints
let constraints = self.get_foreign_keys(table)?;

for constraint in &constraints {
    // Temporarily disable foreign key checks during migration
    self.disable_constraint(constraint)?;
}

// Perform migration
migration_engine.start(migration).await?;

// Re-enable and validate constraints
for constraint in &constraints {
    self.enable_constraint(constraint)?;
    self.validate_constraint(constraint)?;
}
```

11.3 Hot Partitions
Challenge: Some partitions have very high write rate
Solution: Intelligent Scheduling
```rust
pub struct PartitionMigrationScheduler {
    partitions: Vec<Partition>,
}

impl PartitionMigrationScheduler {
    pub fn schedule(&self) -> Vec<MigrationTask> {
        // Measure the write rate for each partition
        let write_rates = self.measure_write_rates();

        // Schedule migrations during low-traffic periods
        let mut tasks = Vec::new();

        for partition in &self.partitions {
            let low_traffic_window = self.find_low_traffic_window(partition);
            tasks.push(MigrationTask {
                partition: partition.clone(),
                scheduled_start: low_traffic_window.start,
                estimated_duration: self.estimate_duration(partition),
            });
        }

        tasks
    }
}
```

11.4 Schema Evolution Conflicts
Challenge: Multiple migrations on same table
Solution: Migration Queue
```rust
pub struct MigrationQueue {
    pending: VecDeque<Migration>,
    active: Option<Migration>,
}

impl MigrationQueue {
    pub async fn enqueue(&mut self, migration: Migration) -> Result<()> {
        // Check for conflicts with the active migration
        if let Some(active) = &self.active {
            if active.table == migration.table {
                // Queue the migration for later
                self.pending.push_back(migration);
                return Ok(());
            }
        }

        // No conflict, start immediately
        self.start(migration).await
    }

    pub async fn on_migration_complete(&mut self, migration_id: &str) -> Result<()> {
        self.active = None;

        // Start the next queued migration
        if let Some(next) = self.pending.pop_front() {
            self.start(next).await?;
        }

        Ok(())
    }
}
```

12. Testing Strategy
12.1 Unit Tests
```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_ghost_table_creation() {
        let engine = MigrationEngine::new(config);
        let ghost_table = engine.create_ghost_table("users", new_schema).await.unwrap();
        assert!(ghost_table.starts_with("_helios_ghost_users_"));
    }

    #[tokio::test]
    async fn test_chunked_copy() {
        let engine = MigrationEngine::new(config);
        engine.copy_chunk("users", ghost_table, 0, 10_000).await.unwrap();
        let count = query_one::<i64>(&format!("SELECT COUNT(*) FROM {}", ghost_table))
            .await
            .unwrap();
        assert_eq!(count, 10_000);
    }

    #[tokio::test]
    async fn test_rollback_early_phase() {
        let engine = MigrationEngine::new(config);
        let migration_id = engine.start(migration).await.unwrap();
        engine.rollback(&migration_id).await.unwrap();
        // Verify the ghost table is dropped
        assert!(!table_exists(&ghost_table_name));
    }
}
```

12.2 Integration Tests
```rust
#[tokio::test]
async fn test_end_to_end_migration() {
    // Setup: create a table with 1M rows
    let db = TestDatabase::new().await;
    db.execute("CREATE TABLE users (id BIGINT PRIMARY KEY, name VARCHAR(255))")
        .await
        .unwrap();
    db.insert_rows("users", 1_000_000).await.unwrap();

    // Start migration: add a column
    let migration_id = db
        .migrate("ALTER TABLE users ADD COLUMN email VARCHAR(255) NULL")
        .await
        .unwrap();

    // Wait for completion
    db.wait_for_migration(&migration_id).await.unwrap();

    // Verify the new schema
    let schema = db.get_schema("users").await.unwrap();
    assert!(schema.has_column("email"));

    // Verify data integrity
    let count = db.query_one::<i64>("SELECT COUNT(*) FROM users").await.unwrap();
    assert_eq!(count, 1_000_000);
}
```
```rust
#[tokio::test]
async fn test_migration_with_live_traffic() {
    let db = TestDatabase::new().await;
    db.execute("CREATE TABLE users (id BIGINT PRIMARY KEY, name VARCHAR(255))")
        .await
        .unwrap();
    db.insert_rows("users", 100_000).await.unwrap();

    // Start migration
    let migration_id = db
        .migrate("ALTER TABLE users ADD COLUMN email VARCHAR(255) NULL")
        .await
        .unwrap();

    // Simulate live traffic during the migration
    // (clone the handle so the writer task doesn't move `db`)
    let writer_db = db.clone();
    let write_handle = tokio::spawn(async move {
        for i in 100_000..110_000 {
            writer_db
                .execute(&format!(
                    "INSERT INTO users (id, name) VALUES ({}, 'user_{}')",
                    i, i
                ))
                .await
                .unwrap();
            tokio::time::sleep(Duration::from_millis(10)).await;
        }
    });

    // Wait for both the migration and the writes
    db.wait_for_migration(&migration_id).await.unwrap();
    write_handle.await.unwrap();

    // Verify all writes are present
    let count = db.query_one::<i64>("SELECT COUNT(*) FROM users").await.unwrap();
    assert_eq!(count, 110_000);
}
```

12.3 Chaos Engineering Tests
```rust
#[tokio::test]
async fn test_migration_resilience_to_failures() {
    let db = TestDatabase::new().await;
    db.insert_rows("users", 1_000_000).await.unwrap();

    // Start migration
    let migration_id = db
        .migrate("ALTER TABLE users ADD COLUMN email VARCHAR(255) NULL")
        .await
        .unwrap();

    // Wait until 50% complete
    db.wait_for_progress(&migration_id, 50).await.unwrap();

    // Simulate a process crash
    db.crash_and_restart().await;

    // Resume the migration
    db.resume_migration(&migration_id).await.unwrap();

    // Verify completion
    db.wait_for_migration(&migration_id).await.unwrap();
    let count = db.query_one::<i64>("SELECT COUNT(*) FROM users").await.unwrap();
    assert_eq!(count, 1_000_000);
}
```
```rust
#[tokio::test]
async fn test_rollback_during_cutover() {
    let db = TestDatabase::new().await;
    db.insert_rows("users", 100_000).await.unwrap();

    let migration_id = db
        .migrate("ALTER TABLE users MODIFY COLUMN id BIGINT")
        .await
        .unwrap();

    // Wait for cutover
    db.wait_for_state(&migration_id, MigrationState::Cutover).await.unwrap();

    // Trigger an emergency rollback
    db.rollback_migration(&migration_id).await.unwrap();

    // Verify the original schema is restored
    let schema = db.get_schema("users").await.unwrap();
    assert_eq!(schema.get_column("id").data_type, DataType::Int);
}
```

13. Monitoring & Observability
13.1 Metrics
```rust
pub struct MigrationMetrics {
    pub migration_id: String,

    // Progress metrics
    pub rows_copied: Counter,
    pub rows_remaining: Gauge,
    pub percent_complete: Gauge,

    // Performance metrics
    pub copy_throughput_rows_per_sec: Gauge,
    pub copy_throughput_bytes_per_sec: Gauge,
    pub io_wait_time_ms: Histogram,

    // Resource metrics
    pub disk_space_used_bytes: Gauge,
    pub cpu_usage_percent: Gauge,
    pub memory_usage_bytes: Gauge,

    // Reliability metrics
    pub retries: Counter,
    pub errors: Counter,
    pub checkpoint_saves: Counter,

    // Validation metrics
    pub validation_mismatches: Counter,
    pub checksum_failures: Counter,
}

impl MigrationMetrics {
    /// Export metrics in Prometheus text exposition format
    pub fn export_prometheus(&self) -> String {
        format!(
            "# HELP heliosdb_migration_progress Migration progress percentage\n\
             # TYPE heliosdb_migration_progress gauge\n\
             heliosdb_migration_progress{{migration_id=\"{}\"}} {}\n\
             \n\
             # HELP heliosdb_migration_throughput Copy throughput in rows/sec\n\
             # TYPE heliosdb_migration_throughput gauge\n\
             heliosdb_migration_throughput{{migration_id=\"{}\"}} {}\n",
            self.migration_id,
            self.percent_complete.get(),
            self.migration_id,
            self.copy_throughput_rows_per_sec.get()
        )
    }
}
```

13.2 Logging
```rust
// Structured logging for migrations
info!(
    migration_id = %migration_id,
    table = %table,
    operation = %operation,
    state = ?state,
    progress_percent = progress,
    "Migration progress update"
);

warn!(
    migration_id = %migration_id,
    chunk_id = chunk_id,
    error = %err,
    "Chunk copy failed, retrying"
);

error!(
    migration_id = %migration_id,
    reason = %reason,
    "Migration failed, initiating rollback"
);
```

13.3 Dashboards
Recommended Grafana Dashboard Panels:
- Migration Progress
  - % Complete gauge
  - Estimated time remaining
  - Rows copied / total rows
- Throughput
  - Rows/sec line chart
  - MB/sec line chart
- Resource Usage
  - CPU % line chart
  - Memory usage line chart
  - Disk I/O line chart
- Health
  - Error rate counter
  - Retry rate counter
  - Validation failures counter
14. Future Enhancements
14.1 AI-Driven Migration Optimization
Vision: Use ML to predict optimal migration parameters
```rust
pub struct AIOptimizer {
    model: MigrationOptimizationModel,
}

impl AIOptimizer {
    pub fn predict_optimal_params(
        &self,
        table_stats: &TableStats,
        workload_pattern: &WorkloadPattern,
    ) -> MigrationParams {
        // The ML model predicts:
        // - Optimal chunk size
        // - Best time to start the migration
        // - Expected duration
        // - Risk of performance impact
        let features = self.extract_features(table_stats, workload_pattern);
        let prediction = self.model.predict(features);

        MigrationParams {
            chunk_size: prediction.chunk_size,
            throttle_io_percent: prediction.throttle,
            recommended_start_time: prediction.best_start_time,
            estimated_duration: prediction.duration,
        }
    }
}
```

14.2 Cross-Region Migration
Vision: Migrate schemas across geographically distributed clusters
```rust
pub struct CrossRegionMigration {
    source_region: Region,
    target_regions: Vec<Region>,
}

impl CrossRegionMigration {
    pub async fn execute(&self, migration: Migration) -> Result<()> {
        // 1. Migrate in the source region first
        let source_migration_id = self
            .migrate_region(&self.source_region, migration.clone())
            .await?;

        // 2. Replicate to target regions
        for region in &self.target_regions {
            self.replicate_migration(region, &source_migration_id).await?;
        }

        // 3. Coordinated cutover across all regions
        self.global_cutover(migration.table()).await?;

        Ok(())
    }
}
```

14.3 Blue-Green Schema Deployment
Vision: Deploy schema changes with instant rollback capability
```rust
// Deploy the new schema as the "green" version
let green_schema = db.deploy_schema_version("users", "v2", new_schema).await?;

// Route 10% of traffic to green
db.route_traffic("users", TrafficSplit { blue: 90, green: 10 }).await?;

// Monitor for issues
if green_schema.has_errors() {
    // Instant rollback
    db.route_traffic("users", TrafficSplit { blue: 100, green: 0 }).await?;
} else {
    // Gradually shift traffic
    db.route_traffic("users", TrafficSplit { blue: 0, green: 100 }).await?;
}
```

15. Conclusion
The Zero-Downtime Migration Architecture enables HeliosDB to perform schema changes on live production systems without downtime or data loss, and with bounded, throttleable performance impact. Key advantages:
15.1 Benefits
- Zero Downtime: Applications continue operating during migrations
- Zero Data Loss: Transaction-consistent migration with validation
- Rollback Safety: One-click rollback with multiple safety checkpoints
- Performance Control: Throttling prevents overwhelming production systems
- Multi-Model Support: Works across relational, document, graph, and time-series models
- Distributed Ready: Designed for multi-region, multi-tenant deployments
15.2 Competitive Advantage
- vs gh-ost: Native distributed support, multi-model, AI optimization
- vs pt-online-schema-change: Better rollback, shadow traffic validation
- vs AWS DMS: Zero-downtime for schema changes (not just data migration)
- vs Liquibase/Flyway: Online execution, not just versioning
15.3 Production Readiness
This architecture is designed for Phase 2 implementation with clear deliverables:
- Month 1: Core migration engine (ghost table, triggers, copy)
- Month 2: Validation framework, rollback mechanisms
- Month 3: Performance optimization, testing, documentation
Estimated Development: 3 months, 2 senior engineers
Investment: $450K
ROI: Essential for enterprise adoption, enables $50M+ ARR
Appendix A: Core Implementation Skeleton
```rust
pub struct MigrationEngine {
    state_store: Arc<MigrationStateStore>,
    ghost_table_manager: Arc<GhostTableManager>,
    copy_engine: Arc<CopyEngine>,
    trigger_manager: Arc<TriggerManager>,
    validator: Arc<ValidatorEngine>,
    orchestrator: Arc<MigrationOrchestrator>,
}

impl MigrationEngine {
    pub async fn start(&self, migration: Migration) -> Result<MigrationId> {
        // 1. Validate the migration
        self.validate(&migration).await?;

        // 2. Create the migration record
        let migration_id = self.state_store.create(migration).await?;

        // 3. Start orchestration
        self.orchestrator.execute(migration_id).await?;

        Ok(migration_id)
    }

    pub async fn get_progress(&self, migration_id: &MigrationId) -> Result<Progress> {
        self.state_store.get_progress(migration_id).await
    }

    pub async fn rollback(&self, migration_id: &MigrationId) -> Result<()> {
        self.orchestrator.rollback(migration_id).await
    }
}
```
```rust
pub struct MigrationOrchestrator {
    state_machine: Arc<StateMachine>,
}

impl MigrationOrchestrator {
    pub async fn execute(&self, migration_id: MigrationId) -> Result<()> {
        loop {
            let state = self.state_machine.current_state(&migration_id).await?;

            match state {
                State::Planning => self.plan(&migration_id).await?,
                State::Preparing => self.prepare(&migration_id).await?,
                State::Copying => self.copy(&migration_id).await?,
                State::Synchronizing => self.synchronize(&migration_id).await?,
                State::Validating => self.validate(&migration_id).await?,
                State::ReadyForCutover => break,
                State::Failed => return Err(Error::MigrationFailed),
                _ => return Err(Error::InvalidState),
            }
        }

        // Wait for manual cutover or auto-cutover
        Ok(())
    }
}
```

End of Document
Next Steps:
- Review with engineering team
- Prototype core components (ghost table, triggers)
- Define detailed API specifications
- Create implementation plan for Phase 2