F6.21 Tenant Replication API Specification
REST and gRPC API Documentation
Feature ID: F6.21 Version: 6.0 Status: Design Phase Date: November 2, 2025 Last Updated: November 2, 2025
Table of Contents
- Overview
- REST API
- gRPC API
- WebSocket API
- Authentication & Authorization
- Error Handling
- Rate Limiting
- Versioning
- SDK Examples
- OpenAPI Specification
1. Overview
1.1 API Architecture
    graph TB
        subgraph "Client Layer"
            WEB[Web Dashboard]
            CLI[CLI Tool]
            SDK[Language SDKs]
        end

        subgraph "API Gateway"
            LB[Load Balancer]
            AUTH[Auth Middleware]
            RATELIMIT[Rate Limiter]
        end

        subgraph "API Layer"
            REST[REST API<br/>Axum]
            GRPC[gRPC API<br/>Tonic]
            WS[WebSocket API]
        end

        subgraph "Business Logic"
            CONTROLLER[Replication Controller]
            ORCHESTRATOR[Orchestrator]
        end

        WEB --> LB
        CLI --> LB
        SDK --> LB

        LB --> AUTH
        AUTH --> RATELIMIT

        RATELIMIT --> REST
        RATELIMIT --> GRPC
        RATELIMIT --> WS

        REST --> CONTROLLER
        GRPC --> CONTROLLER
        WS --> CONTROLLER

        CONTROLLER --> ORCHESTRATOR

        style REST fill:#3498DB
        style GRPC fill:#2ECC71
        style WS fill:#E74C3C

1.2 Base URLs
| Environment | REST API Base URL | gRPC Endpoint |
|---|---|---|
| Production | https://api.heliosdb.com/v1 | grpc.heliosdb.com:443 |
| Staging | https://api-staging.heliosdb.com/v1 | grpc-staging.heliosdb.com:443 |
| Development | http://localhost:8080/v1 | localhost:50051 |
1.3 Protocol Selection Guide
| Use Case | Recommended Protocol | Reason |
|---|---|---|
| Web Dashboard | REST | Browser-friendly, simple |
| CLI Tool | REST | Easy to debug, curl-compatible |
| SDKs | gRPC | Type-safe, high performance |
| Real-time Monitoring | WebSocket | Live updates, low latency |
| Bulk Operations | gRPC Streaming | Efficient, lower overhead |
2. REST API
2.1 Replication Management
2.1.1 Create Replication
Endpoint: POST /replications
Description: Create a new replication configuration for a tenant.
Request:
    {
      "tenant_id": "tenant-123",
      "source": {
        "connection_id": "550e8400-e29b-41d4-a716-446655440001",
        "database": "production_db",
        "schema": "public"
      },
      "target": {
        "connection_id": "550e8400-e29b-41d4-a716-446655440002",
        "database": "replica_db",
        "schema": "public"
      },
      "config": {
        "qos_tier": "Premium",
        "max_lag_seconds": 5,
        "priority": 90,
        "table_filter": ["users.*", "orders.*", "products.*"],
        "row_filter": { "enabled": false },
        "transforms": [
          {
            "type": "AnonymizePII",
            "config": { "columns": ["email", "phone", "ssn"], "method": "Hash" }
          }
        ],
        "compression": { "enabled": true, "level": "Medium" },
        "encryption": { "enabled": true, "algorithm": "AES256GCM" }
      }
    }

Response (201 Created):

    {
      "id": "650e8400-e29b-41d4-a716-446655440003",
      "tenant_id": "tenant-123",
      "status": "Initializing",
      "created_at": "2025-11-02T10:00:00Z",
      "source": {
        "connection_id": "550e8400-e29b-41d4-a716-446655440001",
        "database": "production_db",
        "schema": "public"
      },
      "target": {
        "connection_id": "550e8400-e29b-41d4-a716-446655440002",
        "database": "replica_db",
        "schema": "public"
      },
      "config": {
        "qos_tier": "Premium",
        "max_lag_seconds": 5,
        "priority": 90,
        "table_filter": ["users.*", "orders.*", "products.*"],
        "transforms": [...],
        "compression": {...},
        "encryption": {...}
      },
      "estimated_initial_sync_duration_hours": 2.5,
      "estimated_data_size_gb": 150.3
    }

Error Responses:
- 400 Bad Request: Invalid configuration
- 401 Unauthorized: Missing or invalid authentication
- 403 Forbidden: Insufficient permissions
- 409 Conflict: Replication already exists
- 500 Internal Server Error: Server error
Example:
    curl -X POST https://api.heliosdb.com/v1/replications \
      -H "Authorization: Bearer ${TOKEN}" \
      -H "Content-Type: application/json" \
      -d '{
        "tenant_id": "tenant-123",
        "source": {"connection_id": "550e8400-e29b-41d4-a716-446655440001"},
        "target": {"connection_id": "550e8400-e29b-41d4-a716-446655440002"},
        "config": {"qos_tier": "Premium", "max_lag_seconds": 5}
      }'

2.1.2 Get Replication
Endpoint: GET /replications/{replication_id}
Description: Retrieve replication configuration and current status.
Response (200 OK):
    {
      "id": "650e8400-e29b-41d4-a716-446655440003",
      "tenant_id": "tenant-123",
      "status": "Streaming",
      "health": "Healthy",
      "created_at": "2025-11-02T10:00:00Z",
      "updated_at": "2025-11-02T10:30:00Z",
      "source": {...},
      "target": {...},
      "config": {...},
      "current_state": {
        "replication_lag_seconds": 2.3,
        "replication_lag_bytes": 4096,
        "last_checkpoint": { "lsn": 1234567890, "timestamp": "2025-11-02T10:29:55Z" },
        "throughput": { "rows_per_second": 5432.1, "bytes_per_second": 1048576 }
      },
      "statistics": {
        "total_rows_replicated": 15000000,
        "total_bytes_replicated": 5368709120,
        "uptime_seconds": 1800,
        "avg_compression_ratio": 4.2,
        "conflicts_detected": 0,
        "conflicts_resolved": 0
      }
    }

Error Responses:
- 404 Not Found: Replication not found
- 401 Unauthorized: Missing or invalid authentication
- 403 Forbidden: Insufficient permissions
2.1.3 List Replications
Endpoint: GET /replications
Description: List all replications with filtering and pagination.
Query Parameters:
- tenant_id (optional): Filter by tenant ID
- status (optional): Filter by status (Initializing, Syncing, Streaming, Paused, Stopped)
- qos_tier (optional): Filter by QoS tier
- page (optional, default: 1): Page number
- page_size (optional, default: 50, max: 100): Items per page
- sort_by (optional, default: created_at): Sort field
- sort_order (optional, default: desc): Sort order (asc, desc)
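The page and page_size parameters can be combined with the `pagination` object that this endpoint returns to walk the full result set. The following is a minimal sketch, not an official client: `iter_replications` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real HTTP call to GET /replications so the example runs offline.

```python
from typing import Callable, Iterator


def iter_replications(
    fetch_page: Callable[[int, int], dict], page_size: int = 50
) -> Iterator[dict]:
    """Yield every item by requesting pages until `total_pages` is reached."""
    page = 1
    while True:
        body = fetch_page(page, page_size)
        yield from body["data"]
        if page >= body["pagination"]["total_pages"]:
            break
        page += 1


def fake_fetch(page: int, page_size: int) -> dict:
    """Hypothetical stand-in for GET /replications?page=..&page_size=.."""
    items = [{"id": f"r{i}"} for i in range(5)]
    start = (page - 1) * page_size
    total_pages = -(-len(items) // page_size)  # ceiling division
    return {
        "data": items[start:start + page_size],
        "pagination": {
            "page": page,
            "page_size": page_size,
            "total_pages": total_pages,
            "total_items": len(items),
        },
    }


all_ids = [r["id"] for r in iter_replications(fake_fetch, page_size=2)]
```

Passing the fetch function in as a parameter keeps the paging logic independent of the HTTP library a client happens to use.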
Response (200 OK):
    {
      "data": [
        {
          "id": "650e8400-e29b-41d4-a716-446655440003",
          "tenant_id": "tenant-123",
          "status": "Streaming",
          "qos_tier": "Premium",
          "replication_lag_seconds": 2.3,
          "created_at": "2025-11-02T10:00:00Z"
        },
        {
          "id": "650e8400-e29b-41d4-a716-446655440004",
          "tenant_id": "tenant-456",
          "status": "Syncing",
          "qos_tier": "Standard",
          "replication_lag_seconds": 15.7,
          "created_at": "2025-11-02T09:00:00Z"
        }
      ],
      "pagination": { "page": 1, "page_size": 50, "total_pages": 3, "total_items": 127 }
    }

Example:

    curl -X GET "https://api.heliosdb.com/v1/replications?status=Streaming&qos_tier=Premium&page=1" \
      -H "Authorization: Bearer ${TOKEN}"

2.1.4 Update Replication
Endpoint: PATCH /replications/{replication_id}
Description: Update replication configuration (limited fields).
Request:
    {
      "config": {
        "qos_tier": "Premium",
        "max_lag_seconds": 3,
        "priority": 95,
        "table_filter": ["users.*", "orders.*", "products.*", "analytics.*"]
      }
    }

Response (200 OK):

    {
      "id": "650e8400-e29b-41d4-a716-446655440003",
      "tenant_id": "tenant-123",
      "status": "Streaming",
      "config": {
        "qos_tier": "Premium",
        "max_lag_seconds": 3,
        "priority": 95,
        "table_filter": ["users.*", "orders.*", "products.*", "analytics.*"]
      },
      "updated_at": "2025-11-02T11:00:00Z"
    }

Note: Some fields (source, target) cannot be updated. Use the Migration API instead.
2.1.5 Delete Replication
Endpoint: DELETE /replications/{replication_id}
Description: Stop and delete a replication configuration.
Query Parameters:
- cleanup_target (optional, default: false): Delete target database
Response (204 No Content)
Error Responses:
- 404 Not Found: Replication not found
- 409 Conflict: Replication is in Failover state (cannot delete)
Example:
    curl -X DELETE "https://api.heliosdb.com/v1/replications/650e8400-e29b-41d4-a716-446655440003?cleanup_target=false" \
      -H "Authorization: Bearer ${TOKEN}"

2.2 Replication Control
2.2.1 Start Replication
Endpoint: POST /replications/{replication_id}/start
Description: Start or resume a paused replication.
Request (optional):
    {
      "from_checkpoint": "auto",
      "initial_sync_parallelism": 4
    }

Response (200 OK):

    {
      "id": "650e8400-e29b-41d4-a716-446655440003",
      "status": "Initializing",
      "message": "Replication starting. Initial sync in progress.",
      "estimated_completion": "2025-11-02T12:30:00Z"
    }

2.2.2 Pause Replication
Endpoint: POST /replications/{replication_id}/pause
Description: Pause replication (temporary stop).
Response (200 OK):
    {
      "id": "650e8400-e29b-41d4-a716-446655440003",
      "status": "Paused",
      "paused_at": "2025-11-02T11:30:00Z",
      "last_checkpoint": { "lsn": 1234567890, "timestamp": "2025-11-02T11:29:55Z" }
    }

2.2.3 Stop Replication
Endpoint: POST /replications/{replication_id}/stop
Description: Stop replication permanently.
Response (200 OK):
    {
      "id": "650e8400-e29b-41d4-a716-446655440003",
      "status": "Stopped",
      "stopped_at": "2025-11-02T11:30:00Z",
      "final_statistics": {
        "total_rows_replicated": 15000000,
        "total_bytes_replicated": 5368709120,
        "uptime_seconds": 5400
      }
    }

2.3 Failover Management
2.3.1 Trigger Failover
Endpoint: POST /replications/{replication_id}/failover
Description: Manually trigger failover to promote replica.
Request:
    {
      "reason": "Planned maintenance on primary region",
      "force": false,
      "update_routing": true
    }

Response (202 Accepted):

    {
      "failover_id": "750e8400-e29b-41d4-a716-446655440005",
      "replication_id": "650e8400-e29b-41d4-a716-446655440003",
      "status": "InProgress",
      "started_at": "2025-11-02T12:00:00Z",
      "estimated_duration_seconds": 30,
      "steps": [
        {"step": "ValidateReplicaHealth", "status": "InProgress"},
        {"step": "StopReplication", "status": "Pending"},
        {"step": "PromoteReplica", "status": "Pending"},
        {"step": "UpdateRouting", "status": "Pending"}
      ]
    }

2.3.2 Get Failover Status
Endpoint: GET /failovers/{failover_id}
Description: Get failover progress and status.
Response (200 OK):
    {
      "id": "750e8400-e29b-41d4-a716-446655440005",
      "replication_id": "650e8400-e29b-41d4-a716-446655440003",
      "status": "Completed",
      "started_at": "2025-11-02T12:00:00Z",
      "completed_at": "2025-11-02T12:00:28Z",
      "duration_seconds": 28,
      "downtime_ms": 350,
      "steps": [
        {"step": "ValidateReplicaHealth", "status": "Completed", "duration_ms": 1200},
        {"step": "StopReplication", "status": "Completed", "duration_ms": 500},
        {"step": "PromoteReplica", "status": "Completed", "duration_ms": 350},
        {"step": "UpdateRouting", "status": "Completed", "duration_ms": 25000}
      ],
      "old_primary": {
        "connection_id": "550e8400-e29b-41d4-a716-446655440001",
        "region": "us-east-1"
      },
      "new_primary": {
        "connection_id": "550e8400-e29b-41d4-a716-446655440002",
        "region": "us-west-2"
      }
    }

2.3.3 List Failover History
Endpoint: GET /failovers
Description: List historical failovers.
Query Parameters:
- replication_id (optional): Filter by replication
- status (optional): Filter by status
- page, page_size: Pagination
Response (200 OK):
    {
      "data": [
        {
          "id": "750e8400-e29b-41d4-a716-446655440005",
          "replication_id": "650e8400-e29b-41d4-a716-446655440003",
          "trigger": "Manual",
          "status": "Completed",
          "duration_seconds": 28,
          "downtime_ms": 350,
          "started_at": "2025-11-02T12:00:00Z"
        }
      ],
      "pagination": {...}
    }

2.4 Migration Management
2.4.1 Start Migration
Endpoint: POST /migrations
Description: Start live tenant migration across regions.
Request:
    {
      "tenant_id": "tenant-123",
      "source_region": "us-east-1",
      "target_region": "us-west-2",
      "migration_config": {
        "bulk_copy_parallelism": 8,
        "cdc_lag_threshold_seconds": 1,
        "cutover_strategy": "Automatic",
        "cleanup_source": false
      }
    }

Response (202 Accepted):

    {
      "id": "850e8400-e29b-41d4-a716-446655440006",
      "tenant_id": "tenant-123",
      "source_region": "us-east-1",
      "target_region": "us-west-2",
      "status": "InProgress",
      "current_phase": "BulkCopy",
      "started_at": "2025-11-02T13:00:00Z",
      "estimated_completion": "2025-11-02T15:30:00Z",
      "phases": [
        {
          "phase": "BulkCopy",
          "status": "InProgress",
          "progress_percent": 42.5,
          "estimated_data_size_gb": 150.3
        },
        { "phase": "CDCCatchup", "status": "Pending" },
        { "phase": "Cutover", "status": "Pending" }
      ]
    }

2.4.2 Get Migration Status
Endpoint: GET /migrations/{migration_id}
Description: Get migration progress.
Response (200 OK):
    {
      "id": "850e8400-e29b-41d4-a716-446655440006",
      "tenant_id": "tenant-123",
      "status": "Completed",
      "started_at": "2025-11-02T13:00:00Z",
      "completed_at": "2025-11-02T15:25:00Z",
      "total_duration_seconds": 8700,
      "downtime_ms": 85,
      "data_transferred_gb": 150.3,
      "phases": [
        {
          "phase": "BulkCopy",
          "status": "Completed",
          "duration_seconds": 7200,
          "data_size_gb": 150.3
        },
        {
          "phase": "CDCCatchup",
          "status": "Completed",
          "duration_seconds": 1495,
          "final_lag_seconds": 0.5
        },
        { "phase": "Cutover", "status": "Completed", "duration_ms": 85 }
      ]
    }

2.4.3 Cancel Migration
Endpoint: POST /migrations/{migration_id}/cancel
Description: Cancel in-progress migration (before cutover).
Response (200 OK):
    {
      "id": "850e8400-e29b-41d4-a716-446655440006",
      "status": "Cancelled",
      "cancelled_at": "2025-11-02T14:00:00Z",
      "rollback_status": "Completed"
    }

Error: Cannot cancel after cutover phase starts.
2.5 Metrics & Monitoring
2.5.1 Get Replication Metrics
Endpoint: GET /replications/{replication_id}/metrics
Description: Get time-series metrics for a replication.
Query Parameters:
- start_time: Start timestamp (ISO 8601)
- end_time: End timestamp (ISO 8601)
- resolution: Data resolution (1m, 5m, 1h, 1d)
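As an illustration of how these parameters fit together, the sketch below builds a metrics query URL from timezone-aware datetimes. `metrics_url` is a hypothetical helper, not part of any published SDK, and the client-side validation it performs (known resolution, start before end) is an assumption about what the server would reject anyway.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

ALLOWED_RESOLUTIONS = {"1m", "5m", "1h", "1d"}


def metrics_url(base_url: str, replication_id: str,
                start: datetime, end: datetime, resolution: str = "1m") -> str:
    """Build the GET /replications/{id}/metrics URL with ISO 8601 UTC timestamps."""
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if start >= end:
        raise ValueError("start must be earlier than end")

    def iso(ts: datetime) -> str:
        return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    query = urlencode({
        "start_time": iso(start),
        "end_time": iso(end),
        "resolution": resolution,
    })
    return f"{base_url}/replications/{replication_id}/metrics?{query}"


url = metrics_url(
    "https://api.heliosdb.com/v1",
    "650e8400-e29b-41d4-a716-446655440003",
    datetime(2025, 11, 2, 10, 0, tzinfo=timezone.utc),
    datetime(2025, 11, 2, 11, 0, tzinfo=timezone.utc),
)
```

`urlencode` percent-encodes the colons in the timestamps, which keeps the query string unambiguous for any HTTP client.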
Response (200 OK):
    {
      "replication_id": "650e8400-e29b-41d4-a716-446655440003",
      "time_range": {
        "start": "2025-11-02T10:00:00Z",
        "end": "2025-11-02T11:00:00Z",
        "resolution": "1m"
      },
      "metrics": {
        "replication_lag_seconds": [
          {"timestamp": "2025-11-02T10:00:00Z", "value": 2.1},
          {"timestamp": "2025-11-02T10:01:00Z", "value": 2.3},
          {"timestamp": "2025-11-02T10:02:00Z", "value": 1.9}
        ],
        "throughput_rows_per_second": [
          {"timestamp": "2025-11-02T10:00:00Z", "value": 5234.2},
          {"timestamp": "2025-11-02T10:01:00Z", "value": 5432.1}
        ],
        "compression_ratio": [
          {"timestamp": "2025-11-02T10:00:00Z", "value": 4.1},
          {"timestamp": "2025-11-02T10:01:00Z", "value": 4.3}
        ]
      }
    }

2.5.2 Get Aggregated Metrics
Endpoint: GET /metrics/aggregate
Description: Get aggregated metrics across all replications.
Query Parameters:
- group_by: Grouping dimension (tenant_id, qos_tier, region)
- start_time, end_time: Time range
Response (200 OK):
    {
      "time_range": {...},
      "groups": [
        {
          "group_key": {"qos_tier": "Premium"},
          "metrics": {
            "avg_replication_lag_seconds": 2.1,
            "p99_replication_lag_seconds": 4.8,
            "total_replications": 45,
            "healthy_replications": 44,
            "avg_compression_ratio": 4.2
          }
        },
        {
          "group_key": {"qos_tier": "Standard"},
          "metrics": {
            "avg_replication_lag_seconds": 12.3,
            "p99_replication_lag_seconds": 28.5,
            "total_replications": 120,
            "healthy_replications": 118,
            "avg_compression_ratio": 3.8
          }
        }
      ]
    }

2.6 Health & Status
2.6.1 Health Check
Endpoint: GET /health
Description: Service health check.
Response (200 OK):
    {
      "status": "Healthy",
      "version": "6.0.0",
      "uptime_seconds": 345600,
      "checks": {
        "database": "Healthy",
        "message_queue": "Healthy",
        "worker_pool": "Healthy"
      }
    }

2.6.2 Readiness Check
Endpoint: GET /ready
Description: Service readiness check (for Kubernetes).
Response (200 OK):
    {
      "ready": true
    }

3. gRPC API
3.1 Service Definition (Protocol Buffers)
    syntax = "proto3";

    package heliosdb.replication.v1;

    import "google/protobuf/timestamp.proto";
    import "google/protobuf/duration.proto";
    import "google/protobuf/empty.proto";  // required by DeleteReplication

    // Replication service
    service ReplicationService {
      // Replication management
      rpc CreateReplication(CreateReplicationRequest) returns (Replication);
      rpc GetReplication(GetReplicationRequest) returns (Replication);
      rpc ListReplications(ListReplicationsRequest) returns (ListReplicationsResponse);
      rpc UpdateReplication(UpdateReplicationRequest) returns (Replication);
      rpc DeleteReplication(DeleteReplicationRequest) returns (google.protobuf.Empty);

      // Replication control
      rpc StartReplication(StartReplicationRequest) returns (StartReplicationResponse);
      rpc PauseReplication(PauseReplicationRequest) returns (PauseReplicationResponse);
      rpc StopReplication(StopReplicationRequest) returns (StopReplicationResponse);

      // Streaming
      rpc StreamReplicationMetrics(StreamMetricsRequest) returns (stream ReplicationMetrics);
      rpc StreamReplicationEvents(StreamEventsRequest) returns (stream ReplicationEvent);
    }

    // Failover service
    service FailoverService {
      rpc TriggerFailover(TriggerFailoverRequest) returns (FailoverResponse);
      rpc GetFailoverStatus(GetFailoverStatusRequest) returns (FailoverStatus);
      rpc ListFailoverHistory(ListFailoverHistoryRequest) returns (ListFailoverHistoryResponse);
    }

    // Migration service
    service MigrationService {
      rpc StartMigration(StartMigrationRequest) returns (Migration);
      rpc GetMigrationStatus(GetMigrationStatusRequest) returns (Migration);
      rpc CancelMigration(CancelMigrationRequest) returns (CancelMigrationResponse);
      rpc ListMigrations(ListMigrationsRequest) returns (ListMigrationsResponse);
    }

    // Messages
    message Replication {
      string id = 1;
      string tenant_id = 2;
      ReplicationStatus status = 3;
      ConnectionInfo source = 4;
      ConnectionInfo target = 5;
      ReplicationConfig config = 6;
      ReplicationState current_state = 7;
      ReplicationStatistics statistics = 8;
      google.protobuf.Timestamp created_at = 9;
      google.protobuf.Timestamp updated_at = 10;
    }

    message ConnectionInfo {
      string connection_id = 1;
      string database = 2;
      string schema = 3;
      string region = 4;
    }

    message ReplicationConfig {
      QoSTier qos_tier = 1;
      int32 max_lag_seconds = 2;
      int32 priority = 3;
      repeated string table_filter = 4;
      RowFilter row_filter = 5;
      repeated Transform transforms = 6;
      CompressionConfig compression = 7;
      EncryptionConfig encryption = 8;
    }

    enum QoSTier {
      BEST_EFFORT = 0;
      STANDARD = 1;
      PREMIUM = 2;
      SYNCHRONOUS = 3;
    }

    enum ReplicationStatus {
      INITIALIZING = 0;
      SYNCING = 1;
      STREAMING = 2;
      PAUSED = 3;
      STOPPED = 4;
      ERROR = 5;
    }

    message ReplicationState {
      double replication_lag_seconds = 1;
      int64 replication_lag_bytes = 2;
      Checkpoint last_checkpoint = 3;
      Throughput throughput = 4;
      string health = 5;
    }

    message Checkpoint {
      int64 lsn = 1;
      google.protobuf.Timestamp timestamp = 2;
      int64 transaction_id = 3;
    }

    message Throughput {
      double rows_per_second = 1;
      int64 bytes_per_second = 2;
      double transactions_per_second = 3;
    }

    message ReplicationStatistics {
      int64 total_rows_replicated = 1;
      int64 total_bytes_replicated = 2;
      int64 uptime_seconds = 3;
      double avg_compression_ratio = 4;
      int32 conflicts_detected = 5;
      int32 conflicts_resolved = 6;
    }

    message Transform {
      oneof transform_type {
        AnonymizePIITransform anonymize_pii = 1;
        AggregateTransform aggregate = 2;
        FilterTransform filter = 3;
        CompressColumnsTransform compress_columns = 4;
      }
    }

    message AnonymizePIITransform {
      repeated string columns = 1;
      AnonymizationMethod method = 2;

      enum AnonymizationMethod {
        HASH = 0;
        TOKENIZE = 1;
        REDACT = 2;
        GENERALIZE = 3;
      }
    }

    message CompressionConfig {
      bool enabled = 1;
      CompressionLevel level = 2;

      enum CompressionLevel {
        LOW = 0;
        MEDIUM = 1;
        HIGH = 2;
      }
    }

    message EncryptionConfig {
      bool enabled = 1;
      string algorithm = 2;
    }

    // Request/Response messages
    message CreateReplicationRequest {
      string tenant_id = 1;
      ConnectionInfo source = 2;
      ConnectionInfo target = 3;
      ReplicationConfig config = 4;
    }

    message GetReplicationRequest {
      string id = 1;
    }

    message ListReplicationsRequest {
      string tenant_id = 1;
      ReplicationStatus status = 2;
      QoSTier qos_tier = 3;
      int32 page = 4;
      int32 page_size = 5;
    }

    message ListReplicationsResponse {
      repeated Replication replications = 1;
      Pagination pagination = 2;
    }

    message Pagination {
      int32 page = 1;
      int32 page_size = 2;
      int32 total_pages = 3;
      int32 total_items = 4;
    }

    message StreamMetricsRequest {
      string replication_id = 1;
      int32 interval_seconds = 2;
    }

    message ReplicationMetrics {
      string replication_id = 1;
      google.protobuf.Timestamp timestamp = 2;
      double replication_lag_seconds = 3;
      int64 replication_lag_bytes = 4;
      double rows_per_second = 5;
      int64 bytes_per_second = 6;
      double compression_ratio = 7;
      double cpu_usage_percent = 8;
      int64 memory_usage_bytes = 9;
    }

    message ReplicationEvent {
      string replication_id = 1;
      google.protobuf.Timestamp timestamp = 2;
      string event_type = 3;
      string severity = 4;
      string message = 5;
      map<string, string> details = 6;
    }

    // Failover messages
    message TriggerFailoverRequest {
      string replication_id = 1;
      string reason = 2;
      bool force = 3;
      bool update_routing = 4;
    }

    message FailoverResponse {
      string failover_id = 1;
      string replication_id = 2;
      string status = 3;
      google.protobuf.Timestamp started_at = 4;
      int32 estimated_duration_seconds = 5;
    }

    message FailoverStatus {
      string id = 1;
      string replication_id = 2;
      string status = 3;
      google.protobuf.Timestamp started_at = 4;
      google.protobuf.Timestamp completed_at = 5;
      int32 duration_seconds = 6;
      int64 downtime_ms = 7;
      repeated FailoverStep steps = 8;
    }

    message FailoverStep {
      string step = 1;
      string status = 2;
      int64 duration_ms = 3;
    }

    // Migration messages
    message StartMigrationRequest {
      string tenant_id = 1;
      string source_region = 2;
      string target_region = 3;
      MigrationConfig config = 4;
    }

    message MigrationConfig {
      int32 bulk_copy_parallelism = 1;
      int32 cdc_lag_threshold_seconds = 2;
      string cutover_strategy = 3;
      bool cleanup_source = 4;
    }

    message Migration {
      string id = 1;
      string tenant_id = 2;
      string source_region = 3;
      string target_region = 4;
      string status = 5;
      string current_phase = 6;
      google.protobuf.Timestamp started_at = 7;
      google.protobuf.Timestamp estimated_completion = 8;
      google.protobuf.Timestamp completed_at = 9;
      int32 total_duration_seconds = 10;
      int64 downtime_ms = 11;
      double data_transferred_gb = 12;
      repeated MigrationPhase phases = 13;
    }

    message MigrationPhase {
      string phase = 1;
      string status = 2;
      double progress_percent = 3;
      int32 duration_seconds = 4;
    }

3.2 gRPC Example Usage
Go Client:
    package main

    import (
        "context"
        "io"
        "log"

        pb "github.com/heliosdb/api/replication/v1"
        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials"
    )

    func main() {
        // Connect to the gRPC server over TLS using the system root CAs
        creds := credentials.NewClientTLSFromCert(nil, "")
        conn, err := grpc.Dial("grpc.heliosdb.com:443", grpc.WithTransportCredentials(creds))
        if err != nil {
            log.Fatalf("Failed to connect: %v", err)
        }
        defer conn.Close()

        client := pb.NewReplicationServiceClient(conn)

        // Create replication
        req := &pb.CreateReplicationRequest{
            TenantId: "tenant-123",
            Source: &pb.ConnectionInfo{
                ConnectionId: "550e8400-e29b-41d4-a716-446655440001",
                Database:     "production_db",
            },
            Target: &pb.ConnectionInfo{
                ConnectionId: "550e8400-e29b-41d4-a716-446655440002",
                Database:     "replica_db",
            },
            Config: &pb.ReplicationConfig{
                QosTier:       pb.QoSTier_PREMIUM,
                MaxLagSeconds: 5,
                Priority:      90,
            },
        }

        replication, err := client.CreateReplication(context.Background(), req)
        if err != nil {
            log.Fatalf("Failed to create replication: %v", err)
        }

        log.Printf("Created replication: %s", replication.Id)

        // Stream metrics
        stream, err := client.StreamReplicationMetrics(context.Background(), &pb.StreamMetricsRequest{
            ReplicationId:   replication.Id,
            IntervalSeconds: 5,
        })
        if err != nil {
            log.Fatalf("Failed to stream metrics: %v", err)
        }

        for {
            metrics, err := stream.Recv()
            if err == io.EOF {
                break // the server closed the stream normally
            }
            if err != nil {
                log.Fatalf("Stream error: %v", err)
            }

            log.Printf("Lag: %.2fs, Throughput: %.0f rows/s",
                metrics.ReplicationLagSeconds, metrics.RowsPerSecond)
        }
    }

Python Client:
    import grpc
    from heliosdb.api.replication.v1 import replication_pb2, replication_pb2_grpc

    # Connect
    credentials = grpc.ssl_channel_credentials()
    channel = grpc.secure_channel('grpc.heliosdb.com:443', credentials)
    client = replication_pb2_grpc.ReplicationServiceStub(channel)

    # Create replication
    request = replication_pb2.CreateReplicationRequest(
        tenant_id="tenant-123",
        source=replication_pb2.ConnectionInfo(
            connection_id="550e8400-e29b-41d4-a716-446655440001",
            database="production_db"
        ),
        target=replication_pb2.ConnectionInfo(
            connection_id="550e8400-e29b-41d4-a716-446655440002",
            database="replica_db"
        ),
        config=replication_pb2.ReplicationConfig(
            qos_tier=replication_pb2.QoSTier.PREMIUM,
            max_lag_seconds=5,
            priority=90
        )
    )

    replication = client.CreateReplication(request)
    print(f"Created replication: {replication.id}")

    # Stream metrics
    for metrics in client.StreamReplicationMetrics(
        replication_pb2.StreamMetricsRequest(
            replication_id=replication.id,
            interval_seconds=5
        )
    ):
        print(f"Lag: {metrics.replication_lag_seconds:.2f}s, "
              f"Throughput: {metrics.rows_per_second:.0f} rows/s")

4. WebSocket API
4.1 Connection
Endpoint: wss://api.heliosdb.com/v1/ws
Authentication: Query parameter token=<JWT_TOKEN>
Example:
    const ws = new WebSocket('wss://api.heliosdb.com/v1/ws?token=' + token);

    ws.onopen = () => {
      console.log('Connected');
    };

    ws.onmessage = (event) => {
      const data = JSON.parse(event.data);
      console.log('Received:', data);
    };

4.2 Subscribe to Metrics
Message:
    {
      "action": "subscribe",
      "channel": "metrics",
      "filters": { "replication_id": "650e8400-e29b-41d4-a716-446655440003" }
    }

Response Stream:

    {
      "channel": "metrics",
      "replication_id": "650e8400-e29b-41d4-a716-446655440003",
      "timestamp": "2025-11-02T10:00:00Z",
      "data": {
        "replication_lag_seconds": 2.3,
        "rows_per_second": 5432.1,
        "compression_ratio": 4.2
      }
    }

4.3 Subscribe to Events
Message:
    {
      "action": "subscribe",
      "channel": "events",
      "filters": { "tenant_id": "tenant-123", "severity": ["Error", "Critical"] }
    }

Response Stream:

    {
      "channel": "events",
      "event_type": "ReplicationError",
      "severity": "Error",
      "replication_id": "650e8400-e29b-41d4-a716-446655440003",
      "timestamp": "2025-11-02T10:05:00Z",
      "message": "Connection timeout to target database",
      "details": { "error_code": "CONN_TIMEOUT", "retry_count": 3 }
    }

5. Authentication & Authorization
5.1 JWT Authentication
Header:
    Authorization: Bearer <JWT_TOKEN>

JWT Payload:

    {
      "sub": "user-12345",
      "iss": "heliosdb.com",
      "aud": "heliosdb-api",
      "exp": 1730638800,
      "iat": 1730552400,
      "roles": ["Admin"],
      "tenant_id": "tenant-123",
      "permissions": [
        "replication:create",
        "replication:read",
        "replication:update",
        "replication:delete",
        "failover:trigger"
      ]
    }

5.2 API Key Authentication
Header:
    X-API-Key: <API_KEY>

Use Case: Service-to-service authentication
5.3 OAuth 2.0
Supported Flows:
- Authorization Code Flow (for web apps)
- Client Credentials Flow (for services)
Token Endpoint: POST /oauth/token
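For the Client Credentials Flow, a token request is conventionally a form-encoded POST with `grant_type=client_credentials`, as defined by OAuth 2.0 (RFC 6749). The sketch below only builds the request and does not send it. `build_token_request` is a hypothetical helper, and both the full token URL and the choice to send the client secret in the form body (rather than via HTTP Basic authentication) are assumptions that this specification does not settle.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_url: str, client_id: str, client_secret: str,
                        scope: Optional[str] = None) -> Request:
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scope:
        form["scope"] = scope
    return Request(
        token_url,
        data=urlencode(form).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Assumed URL: the spec gives only the path "/oauth/token".
token_request = build_token_request(
    "https://api.heliosdb.com/v1/oauth/token", "my-service", "s3cret"
)
```

A service would pass the request to `urllib.request.urlopen` (or an equivalent HTTP client) and then present the returned access token as a Bearer token, as described in section 5.1.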
6. Error Handling
6.1 Error Response Format
    {
      "error": {
        "code": "REPLICATION_LAG_EXCEEDED",
        "message": "Replication lag exceeded maximum threshold",
        "details": {
          "replication_id": "650e8400-e29b-41d4-a716-446655440003",
          "current_lag_seconds": 120,
          "max_lag_seconds": 30
        },
        "request_id": "req-950e8400-e29b-41d4-a716-446655440007",
        "timestamp": "2025-11-02T10:00:00Z"
      }
    }

6.2 Error Codes
| Code | HTTP Status | Description |
|---|---|---|
| INVALID_REQUEST | 400 | Malformed request |
| VALIDATION_ERROR | 400 | Validation failed |
| UNAUTHORIZED | 401 | Missing or invalid auth |
| FORBIDDEN | 403 | Insufficient permissions |
| NOT_FOUND | 404 | Resource not found |
| CONFLICT | 409 | Resource conflict |
| REPLICATION_LAG_EXCEEDED | 422 | Lag too high |
| FAILOVER_IN_PROGRESS | 422 | Cannot modify during failover |
| RATE_LIMIT_EXCEEDED | 429 | Too many requests |
| INTERNAL_ERROR | 500 | Internal server error |
| SERVICE_UNAVAILABLE | 503 | Service temporarily unavailable |
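A client can turn the error envelope from section 6.1 into a typed exception keyed on the `code` field. This is a minimal sketch under stated assumptions: `ApiError` and `parse_error` are hypothetical names, and the retryable set reflects only the 429 and 503 guidance in section 6.3.

```python
import json

# Codes that section 6.3 recommends retrying (429 and 503).
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "SERVICE_UNAVAILABLE"}


class ApiError(Exception):
    """Hypothetical exception carrying the fields of the error envelope."""

    def __init__(self, http_status, code, message, details=None, request_id=None):
        super().__init__(f"{code} ({http_status}): {message}")
        self.http_status = http_status
        self.code = code
        self.message = message
        self.details = details or {}
        self.request_id = request_id

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def parse_error(http_status: int, body_text: str) -> ApiError:
    """Parse a JSON error body; missing fields fall back to neutral defaults."""
    err = json.loads(body_text).get("error", {})
    return ApiError(
        http_status,
        err.get("code", "UNKNOWN"),
        err.get("message", ""),
        err.get("details"),
        err.get("request_id"),
    )


example = parse_error(
    422,
    '{"error": {"code": "REPLICATION_LAG_EXCEEDED", "message": "Lag too high",'
    ' "details": {"current_lag_seconds": 120}, "request_id": "req-1"}}',
)
```

Keeping `request_id` on the exception makes it easy to quote in support requests and logs.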
6.3 Retry Strategy
Recommended Exponential Backoff:
    retry_delay = min(max_delay, base_delay * 2^attempt)

    base_delay = 1 second
    max_delay = 60 seconds
    max_attempts = 5

Retry on:
- 503 Service Unavailable
- 429 Rate Limit Exceeded (use Retry-After header)
- Network errors
Do NOT retry on:
- 400 Bad Request
- 401 Unauthorized
- 403 Forbidden
- 404 Not Found
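The backoff formula and retry rules above can be sketched as follows. This is an illustrative helper rather than SDK code: `call_with_retry` and the `retry_after_seconds` attribute are hypothetical, and counting attempts from 0 (so the first delay is 1 second) is an assumption about how the formula is meant to be read.

```python
import time


def retry_delay(attempt: int, base_delay: float = 1.0, max_delay: float = 60.0) -> float:
    """retry_delay = min(max_delay, base_delay * 2^attempt), attempt counted from 0."""
    return min(max_delay, base_delay * 2 ** attempt)


def call_with_retry(op, is_retryable, max_attempts: int = 5, sleep=time.sleep):
    """Run `op`, retrying retryable failures with exponential backoff.

    `is_retryable(exc)` decides whether an exception is worth retrying
    (503, 429, network errors). If the exception exposes a hypothetical
    `retry_after_seconds` attribute (e.g. copied from a Retry-After header),
    that value takes precedence over the computed delay.
    """
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(exc):
                raise
            hinted = getattr(exc, "retry_after_seconds", None)
            sleep(hinted if hinted is not None else retry_delay(attempt))


delays = [retry_delay(a) for a in range(7)]
```

The `sleep` parameter exists so tests can inject a recorder instead of actually waiting.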
7. Rate Limiting
7.1 Rate Limit Headers
Response Headers:
    X-RateLimit-Limit: 1000
    X-RateLimit-Remaining: 995
    X-RateLimit-Reset: 1730552500

7.2 Rate Limits by Tier
| Tier | Requests/Minute | Burst |
|---|---|---|
| Free | 60 | 10 |
| Standard | 600 | 50 |
| Premium | 6000 | 200 |
| Enterprise | Custom | Custom |
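Clients that want to stay under these limits can throttle themselves before sending. The sketch below uses a token bucket, which is one common way to model a sustained rate plus a burst allowance. It is an assumption that "Burst" corresponds to bucket capacity; this specification does not say which algorithm the server uses, and `TokenBucket` is a hypothetical name.

```python
class TokenBucket:
    """Client-side throttle: refills at requests_per_minute, holds at most `burst` tokens."""

    def __init__(self, requests_per_minute: float, burst: int, now: float = 0.0):
        self.rate = requests_per_minute / 60.0  # tokens added per second
        self.capacity = float(burst)
        self.tokens = float(burst)              # start with a full bucket
        self.updated = now

    def try_acquire(self, now: float) -> bool:
        """Return True and consume a token if a request may be sent at time `now`."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Free tier from the table above: 60 requests/minute, burst of 10.
bucket = TokenBucket(requests_per_minute=60, burst=10)
burst_results = [bucket.try_acquire(now=0.0) for _ in range(11)]
after_one_second = bucket.try_acquire(now=1.0)
```

In real use `now` would come from `time.monotonic()`; passing it explicitly keeps the example deterministic.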
7.3 Rate Limit Response
429 Too Many Requests:
    {
      "error": {
        "code": "RATE_LIMIT_EXCEEDED",
        "message": "Rate limit exceeded. Retry after 30 seconds.",
        "retry_after_seconds": 30
      }
    }

8. Versioning
8.1 API Versioning Strategy
URL Versioning: /v1/, /v2/, etc.
Deprecation Policy:
- 6 months notice before deprecation
- 12 months support after deprecation announcement
- Clear migration guide provided
8.2 Breaking Changes
What constitutes a breaking change:
- Removing or renaming fields
- Changing field types
- Changing status codes
- Removing endpoints
Non-breaking changes:
- Adding new fields (ignored by old clients)
- Adding new endpoints
- Adding new optional parameters
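Adding fields is only non-breaking if clients ignore fields they do not recognise. A minimal sketch of such a tolerant reader is shown below; `ReplicationSummary` and `from_response` are hypothetical names, and the three fields are taken from the list response in section 2.1.3.

```python
from dataclasses import dataclass, fields


@dataclass
class ReplicationSummary:
    id: str
    tenant_id: str
    status: str


def from_response(cls, payload: dict):
    """Build `cls` from a JSON object, silently dropping unknown keys."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in payload.items() if k in known})


summary = from_response(
    ReplicationSummary,
    {
        "id": "650e8400-e29b-41d4-a716-446655440003",
        "tenant_id": "tenant-123",
        "status": "Streaming",
        "field_added_in_a_later_release": 42,  # ignored by this older client
    },
)
```

A client written this way keeps working when the server starts returning additional fields, which is the behaviour the non-breaking list relies on.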
9. SDK Examples
9.1 TypeScript SDK
    import { HeliosDBReplicationClient } from '@heliosdb/replication-sdk';

    const client = new HeliosDBReplicationClient({
      apiKey: process.env.HELIOSDB_API_KEY,
      baseUrl: 'https://api.heliosdb.com/v1'
    });

    // Create replication
    const replication = await client.replications.create({
      tenantId: 'tenant-123',
      source: {
        connectionId: '550e8400-e29b-41d4-a716-446655440001',
        database: 'production_db'
      },
      target: {
        connectionId: '550e8400-e29b-41d4-a716-446655440002',
        database: 'replica_db'
      },
      config: {
        qosTier: 'Premium',
        maxLagSeconds: 5,
        priority: 90
      }
    });

    console.log(`Created replication: ${replication.id}`);

    // Start replication
    await client.replications.start(replication.id);

    // Monitor metrics
    const metricsStream = client.replications.streamMetrics(replication.id);
    metricsStream.on('data', (metrics) => {
      console.log(`Lag: ${metrics.replicationLagSeconds}s`);
    });

9.2 Python SDK
    import os

    from heliosdb_replication import ReplicationClient

    client = ReplicationClient(
        api_key=os.environ['HELIOSDB_API_KEY'],
        base_url='https://api.heliosdb.com/v1'
    )

    # Create replication
    replication = client.replications.create(
        tenant_id='tenant-123',
        source={'connection_id': '550e8400-e29b-41d4-a716-446655440001'},
        target={'connection_id': '550e8400-e29b-41d4-a716-446655440002'},
        config={'qos_tier': 'Premium', 'max_lag_seconds': 5}
    )

    print(f"Created replication: {replication.id}")

    # Start replication
    client.replications.start(replication.id)

    # Monitor metrics
    for metrics in client.replications.stream_metrics(replication.id):
        print(f"Lag: {metrics.replication_lag_seconds}s")

9.3 Rust SDK
    use std::env;

    // Assumption: stream_metrics() returns a futures::Stream, so `next()` comes from StreamExt.
    use futures::StreamExt;
    use heliosdb_replication::{
        ConnectionInfo, CreateReplicationRequest, QosTier, ReplicationClient, ReplicationConfig,
    };

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn std::error::Error>> {
        let client = ReplicationClient::new(
            env::var("HELIOSDB_API_KEY")?,
            "https://api.heliosdb.com/v1"
        );

        // Create replication
        let replication = client.replications().create(CreateReplicationRequest {
            tenant_id: "tenant-123".into(),
            source: ConnectionInfo {
                connection_id: "550e8400-e29b-41d4-a716-446655440001".into(),
                ..Default::default()
            },
            target: ConnectionInfo {
                connection_id: "550e8400-e29b-41d4-a716-446655440002".into(),
                ..Default::default()
            },
            config: ReplicationConfig {
                qos_tier: QosTier::Premium,
                max_lag_seconds: 5,
                priority: 90,
                ..Default::default()
            },
        }).await?;

        println!("Created replication: {}", replication.id);

        // Start replication
        client.replications().start(&replication.id).await?;

        // Monitor metrics
        let mut stream = client.replications().stream_metrics(&replication.id).await?;
        while let Some(metrics) = stream.next().await {
            println!("Lag: {}s", metrics.replication_lag_seconds);
        }

        Ok(())
    }

10. OpenAPI Specification
Full OpenAPI 3.1 specification available at:
https://api.heliosdb.com/v1/openapi.yaml
Interactive Swagger UI:
https://api.heliosdb.com/docs
Document Version: 1.0 Status: Draft for Review Authors: API Design Team Last Updated: November 2, 2025
HeliosDB Tenant Replication API - Comprehensive, type-safe, developer-friendly