
Edge Deployment — Offline-First Database with Cloud Sync

UVP

Deploy HeliosDB Full to stores, factories, vehicles, and mobile apps that may be offline for hours and still need transactional reads and writes. The edge subsystem provides <1 ms offline queries, bidirectional CRDT sync (7 CRDT types, 90%+ bandwidth reduction via delta sync), local ML inference for predictions without round-tripping to the cloud, and a DuckDB-compatible embedded engine for local analytics. Same SQL surface in the data centre and on a Raspberry Pi — the sync engine handles conflicts automatically.


Prerequisites

  • HeliosDB Full v8.0.3 (the edge crates are part of the workspace; no separate binary)
  • A central / cloud HeliosDB instance to sync with
  • An edge target — mobile, IoT device, retail POS terminal, or anything that runs Rust + libc
  • ~30 minutes

There are four edge sub-crates. You compose the ones you need; you do not need to ship them all.


1. The Four Edge Crates

Crate          | What it gives you                                               | When to ship it
edge-compute   | Run queries locally on the edge                                 | Always
edge-sync      | Bidirectional CRDT sync to the cloud                            | Always for connected edge
edge-ai        | Local ML inference (cached models, incremental learning)        | Recommendation/anomaly use cases
embedded-cloud | DuckDB-compatible WASM analytics engine + hybrid query routing  | Analytics workloads on the edge

edge-compute, edge-ai and embedded-cloud are workspace-internal crates. The most fully documented is edge-sync (extensive README, patent disclosure on the bi-directional protocol). embedded-cloud is documented as “Innovation #6 of HeliosDB v7.0”.


2. Quick Start — A POS Terminal

The canonical use case: a 1000-store retail chain. Each register must keep processing sales offline (a router outage must not take down checkout) and sync to HQ when connectivity returns.

use heliosdb_edge::{EdgeEngine, EdgeConfig};

let config = EdgeConfig {
    node_id: "store-42-register-3".into(),
    cloud_endpoint: Some("https://retail-hq.example.com".into()),
    sync_interval: 60,                          // sync every minute when online
    max_storage_bytes: 10 * 1024 * 1024 * 1024, // 10 GB
    offline_mode: false,                        // auto-detect
    cache_config: Default::default(),
};

let mut engine = EdgeEngine::new(config);
engine.start().await?;

// Process sales OFFLINE — no cloud required
insert_sale(&sale).await?;
engine.enqueue_sync(SyncData::from(&sale))?;

// Auto-sync when online (CRDT conflict resolution)
let status = engine.sync().await?;
println!("Synced {} bytes ({} items)", status.bytes_synced, status.items);

Performance characteristics shipped with the crate:

  • Offline query latency: <1 ms
  • Sync latency: <50 ms
  • Bandwidth reduction: 90%+ (delta sync + LZ4/Zstd compression)

3. The Sync Protocol — edge-sync

This is the most production-hardened edge crate (separate patent disclosure, 70 unit + 21 integration + 13 chaos tests). The README documents it as F5.2.5.

Configure the sync manager

use heliosdb_edge_sync::*;

let config = manager::SyncConfig {
    node_id: "edge-device-1".into(),
    cloud_node_id: "cloud-primary".into(),
    conflict_strategy: conflict::ConflictStrategy::LastWriteWins,
    compression: compression::CompressionAlgorithm::Lz4,
    scheduling: scheduler::SchedulingStrategy::Adaptive,
    bidirectional: true,
    ..Default::default()
};

let mgr = manager::EdgeSyncManager::new(config);

Conflict resolution strategies

Strategy       | When to use
LastWriteWins  | Default; correct for most user-facing data
FirstWriteWins | Immutable / append-only records
MaxValue       | Counters, gauges (don’t lose increments)
MinValue       | Bounds, capacity reservations
Custom         | Domain-specific merge logic
Manual         | Critical data — flag for review, never auto-resolve

You can set a strategy globally and override per table:

let mut resolver = conflict::ConflictResolver::new(conflict::ConflictStrategy::LastWriteWins);
resolver.set_table_strategy("page_views".into(), conflict::ConflictStrategy::MaxValue);
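The override logic is easy to picture independently of the crate API. Here is a minimal, self-contained sketch of a global default with per-table overrides; the `Strategy` enum and `Resolver` type are illustrative stand-ins, not the `heliosdb_edge_sync` types:

```rust
use std::collections::HashMap;

/// Illustrative subset of the strategies in the table above.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Strategy {
    LastWriteWins,
    MaxValue,
}

/// A global default plus per-table overrides.
pub struct Resolver {
    default: Strategy,
    per_table: HashMap<String, Strategy>,
}

impl Resolver {
    pub fn new(default: Strategy) -> Self {
        Resolver { default, per_table: HashMap::new() }
    }

    pub fn set_table_strategy(&mut self, table: &str, s: Strategy) {
        self.per_table.insert(table.to_string(), s);
    }

    /// Resolve a conflict between a local and a remote (value, timestamp) pair.
    pub fn resolve(&self, table: &str, local: (i64, u64), remote: (i64, u64)) -> i64 {
        match self.per_table.get(table).copied().unwrap_or(self.default) {
            // Higher timestamp wins; ties go to the remote write.
            Strategy::LastWriteWins => if local.1 > remote.1 { local.0 } else { remote.0 },
            // Never lose increments: keep the larger value.
            Strategy::MaxValue => local.0.max(remote.0),
        }
    }
}

fn main() {
    let mut r = Resolver::new(Strategy::LastWriteWins);
    r.set_table_strategy("page_views", Strategy::MaxValue);
    // profiles: the newer write (timestamp 20) wins.
    println!("{}", r.resolve("profiles", (1, 10), (2, 20)));
    // page_views: the larger count survives, regardless of timestamps.
    println!("{}", r.resolve("page_views", (100, 10), (90, 20)));
}
```

This is why the checklist below warns against LastWriteWins for counters: with a timestamp-based merge, the older (larger) count would simply be discarded.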

Adaptive scheduling

The sync interval changes with network quality so devices don’t burn battery dialling out on a flaky 2G connection:

scheduler::NetworkCondition::Excellent => 10 s interval
scheduler::NetworkCondition::Good      => 30 s
scheduler::NetworkCondition::Poor      => 5 min
scheduler::NetworkCondition::Offline   => no sync (queue locally)
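The mapping itself is a plain match. A self-contained sketch, where the enum and function names mirror (but are not) the scheduler API:

```rust
use std::time::Duration;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum NetworkCondition { Excellent, Good, Poor, Offline }

/// Map network quality to a sync interval, per the table above.
/// Returns None when offline: writes stay in the local queue.
pub fn sync_interval(cond: NetworkCondition) -> Option<Duration> {
    match cond {
        NetworkCondition::Excellent => Some(Duration::from_secs(10)),
        NetworkCondition::Good => Some(Duration::from_secs(30)),
        NetworkCondition::Poor => Some(Duration::from_secs(5 * 60)),
        NetworkCondition::Offline => None, // queue locally, drain later
    }
}

fn main() {
    println!("{:?}", sync_interval(NetworkCondition::Poor));    // 300 s
    println!("{:?}", sync_interval(NetworkCondition::Offline)); // None
}
```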

Offline queue

The local write queue holds 10,000+ operations (configurable to 100,000+). When connectivity returns, the queue drains in priority order with batched dequeues:

let queue = offline::OfflineQueueManager::new(
    "edge-device-1".into(),
    offline::OfflineQueueConfig {
        max_capacity: 50_000,
        enable_priority: true,
        max_retries: 3,
        ..Default::default()
    },
);

queue.enqueue_with_priority(change, 5)?; // priority 5
let batch = queue.dequeue_batch(1000);   // drain 1000 at a time
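Under the hood this is essentially a bounded priority queue. A self-contained sketch of the behaviour, assuming illustrative types (not the `offline` module's internals):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// One queued change. Derived Ord compares fields in order:
/// higher `priority` drains first, then FIFO within a priority level
/// (Reverse makes the oldest sequence number sort highest).
#[derive(PartialEq, Eq, PartialOrd, Ord)]
struct Queued {
    priority: u8,
    seq: Reverse<u64>,
    payload: String,
}

pub struct OfflineQueue {
    heap: BinaryHeap<Queued>,
    next_seq: u64,
    max_capacity: usize,
}

impl OfflineQueue {
    pub fn new(max_capacity: usize) -> Self {
        OfflineQueue { heap: BinaryHeap::new(), next_seq: 0, max_capacity }
    }

    pub fn enqueue(&mut self, payload: &str, priority: u8) -> Result<(), &'static str> {
        if self.heap.len() >= self.max_capacity {
            return Err("queue full"); // the real crate would surface a typed error
        }
        self.heap.push(Queued {
            priority,
            seq: Reverse(self.next_seq),
            payload: payload.to_string(),
        });
        self.next_seq += 1;
        Ok(())
    }

    /// Drain up to `n` items, highest priority first.
    pub fn dequeue_batch(&mut self, n: usize) -> Vec<String> {
        (0..n).filter_map(|_| self.heap.pop().map(|q| q.payload)).collect()
    }
}

fn main() {
    let mut q = OfflineQueue::new(50_000);
    q.enqueue("low", 1).unwrap();
    q.enqueue("urgent", 5).unwrap();
    q.enqueue("low-2", 1).unwrap();
    // Drains urgent first, then the low-priority items in arrival order.
    println!("{:?}", q.dequeue_batch(3)); // ["urgent", "low", "low-2"]
}
```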

Compression — pick your algorithm

Algorithm     | Speed     | Ratio | When to use
LZ4 (default) | Very fast | 2-3×  | Mobile/IoT — CPU-bound
Zstd          | Fast      | 3-5×  | Backhaul-bound, abundant CPU

Real-world bandwidth savings: 85-95% on structured data, 70-85% on JSON, 0-10% on already-compressed media.
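The savings compound: delta sync shrinks the payload first, then compression shrinks the delta. A back-of-envelope sketch with a hypothetical helper (illustrative only, not a crate API):

```rust
/// Bytes actually sent after delta sync and compression, plus the saving
/// relative to shipping the full rows uncompressed.
pub fn wire_cost(full_bytes: u64, delta_fraction: f64, compression_ratio: f64) -> (u64, f64) {
    let delta = full_bytes as f64 * delta_fraction; // only changed rows travel
    let sent = delta / compression_ratio;           // e.g. ~3.0 for LZ4 on structured data
    (sent as u64, 1.0 - sent / full_bytes as f64)
}

fn main() {
    // 100 MB of rows, 20% changed since last sync, 3x compression:
    let (sent, saved) = wire_cost(100 * 1024 * 1024, 0.20, 3.0);
    // Roughly 7% of the data goes on the wire, a ~93% reduction,
    // consistent with the 85-95% figure for structured data above.
    println!("sent {} bytes, saved {:.0}%", sent, saved * 100.0);
}
```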


4. CRDT Types — The Conflict-Free Foundation

The edge engine includes all seven canonical CRDT types:

CRDT         | Use case
G-Counter    | Grow-only counter (page views)
PN-Counter   | Positive-negative counter (inventory adjustments)
OR-Set       | Observed-remove set (tag systems)
LWW-Register | Last-write-wins register (user profile fields)
RGA          | Replicated growable array (collaborative lists)
LSeq         | List CRDT with tombstones (ordered collaboration)
ORMap        | Observed-remove map (key-value with concurrent edits)

You don’t usually instantiate CRDTs directly — the sync layer chooses them based on column type and the conflict strategy you set.
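To see why sync can be conflict-free at all, here is a minimal G-Counter from the table above (a teaching sketch, not the engine's internal type). Each node increments only its own slot, and merge takes the per-slot max, so merges commute, associate, and are idempotent: replicas converge no matter the order or number of sync exchanges.

```rust
use std::collections::HashMap;

/// Grow-only counter: one slot per node, merge = per-slot max.
#[derive(Clone, Default)]
pub struct GCounter {
    slots: HashMap<String, u64>,
}

impl GCounter {
    /// A node only ever increments its own slot.
    pub fn increment(&mut self, node: &str, by: u64) {
        *self.slots.entry(node.to_string()).or_insert(0) += by;
    }

    /// Total across all nodes.
    pub fn value(&self) -> u64 {
        self.slots.values().sum()
    }

    /// Safe to apply in any order, any number of times; this is the
    /// CRDT property the sync layer relies on.
    pub fn merge(&mut self, other: &GCounter) {
        for (node, &n) in &other.slots {
            let slot = self.slots.entry(node.clone()).or_insert(0);
            *slot = (*slot).max(n);
        }
    }
}

fn main() {
    // Two replicas diverge offline, then sync in both directions.
    let mut store_a = GCounter::default();
    let mut store_b = GCounter::default();
    store_a.increment("store-a", 3);
    store_b.increment("store-b", 5);

    let snapshot = store_a.clone();
    store_a.merge(&store_b);
    store_b.merge(&snapshot);
    println!("{} {}", store_a.value(), store_b.value()); // both converge to 8
}
```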


5. Edge AI — Local Inference Without a Round-Trip

edge-ai runs ML inference on the edge node so predictions don’t require a cloud round-trip. The crate is documented as Active (single-paragraph README); for production use, treat it as a working subsystem with a limited public API surface, and prefer narrow, well-tested model invocations until the documentation matures.

Typical pattern (consult the source for current API):

// Pseudo-code — verify against current API
let model = edge_ai::load_cached_model("anomaly-v3").await?;
let score = model.predict(&features).await?;
if score > 0.85 {
    flag_transaction_for_review(&txn);
}

Local model caching means a 50 MB anomaly model loads once, lives in RAM, and serves predictions in <2 ms — vs. 200-500 ms for a cloud round-trip on a 4G network.
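A load-once cache is the heart of that number. A hypothetical sketch of the pattern (none of these names come from edge-ai; the slow disk read is simulated):

```rust
use std::collections::HashMap;

/// Stand-in for a deserialized model held in RAM.
pub struct Model {
    pub name: String,
}

impl Model {
    /// Dummy scoring function; a real model would run inference here.
    pub fn predict(&self, features: &[f32]) -> f32 {
        features.iter().sum::<f32>() / features.len() as f32
    }
}

/// Load each model from disk at most once; serve later calls from RAM.
pub struct ModelCache {
    models: HashMap<String, Model>,
    pub loads: u32, // how many slow disk loads actually happened
}

impl ModelCache {
    pub fn new() -> Self {
        ModelCache { models: HashMap::new(), loads: 0 }
    }

    pub fn get(&mut self, name: &str) -> &Model {
        if !self.models.contains_key(name) {
            self.loads += 1; // the one-time 50 MB read would happen here
            self.models.insert(name.to_string(), Model { name: name.to_string() });
        }
        &self.models[name]
    }
}

fn main() {
    let mut cache = ModelCache::new();
    let score = cache.get("anomaly-v3").predict(&[0.9, 0.8]);
    cache.get("anomaly-v3"); // second call: RAM only, no reload
    println!("score {:.2}, loads {}", score, cache.loads); // loads stays at 1
}
```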


6. embedded-cloud — DuckDB-Compatible Local Analytics

The fourth edge crate is the most ambitious. It ships a DuckDB-compatible local engine (100% SQL dialect, including List/Struct types, ASOF joins, Pivot/Unpivot) with hybrid query routing — a cost-based optimizer decides whether to run a query locally or push it to the cloud.

use heliosdb_embedded_cloud::{EmbeddedEngine, EmbeddedConfig};

let config = EmbeddedConfig {
    max_memory_mb: 512,
    enable_cloud_sync: true,
    offline_mode: false,
    ..Default::default()
};

let engine = EmbeddedEngine::new(config).await?;

// Engine decides: local (fast, free) or cloud (more data, more cost)
let result = engine.execute("SELECT * FROM sales WHERE date >= '2026-01-01'").await?;
engine.sync().await?;
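The routing decision can be sketched as a cost comparison. The numbers, names, and rules below are illustrative (the real planner is cost-based and more elaborate), but they show the shape of the trade-off:

```rust
#[derive(Debug, PartialEq)]
pub enum Route { Local, Cloud }

/// Route a query locally when the local replica can answer it and the
/// cloud run would blow the budget or not be meaningfully faster.
pub fn route(
    local_has_data: bool,
    est_local_ms: u64,
    est_cloud_ms: u64,
    est_cloud_cost_usd: f64,
    max_cloud_cost_usd: f64,
    prefer_local: bool,
) -> Route {
    if !local_has_data {
        return Route::Cloud; // only the cloud holds the full dataset
    }
    if est_cloud_cost_usd > max_cloud_cost_usd {
        return Route::Local; // budget guard rail, like max_cloud_cost_usd above
    }
    if prefer_local || est_local_ms <= est_cloud_ms {
        Route::Local
    } else {
        Route::Cloud
    }
}

fn main() {
    // Local replica has the rows and prefer_local is set: stay local.
    println!("{:?}", route(true, 8, 120, 0.02, 0.10, true));
    // Query touches data only the cloud has: must go out.
    println!("{:?}", route(false, 0, 200, 0.02, 0.10, true));
}
```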

Tunables:

EmbeddedConfig {
    max_memory_mb: 512,
    cache_size_mb: 128,
    enable_cloud_sync: true,
    sync_interval_secs: 60,
    offline_mode: false,
    compression_level: 6,     // zstd
    local_storage_quota_mb: 1024,
    prefer_local: true,
    max_cloud_cost_usd: 0.10, // budget guard rail
}

Performance targets:

  • 100% DuckDB SQL-dialect compatibility
  • <1 s sync latency for small datasets (<10 MB)
  • 10× faster than cloud-only for local-eligible queries
  • 100% of CRUD operations work offline
  • 3:1 compression
  • <5% battery impact on mobile

7. Use Cases by Industry

Retail POS — 1000 stores, intermittent Wi-Fi

EdgeConfig {
    node_id: format!("store-{}-register-{}", store, register),
    cloud_endpoint: Some(hq_url),
    sync_interval: 60,
    max_storage_bytes: 10 * GB, // with e.g. const GB: u64 = 1 << 30
    ..Default::default()
}

Pattern: register processes sales locally; offline queue holds 24h+ of transactions; sync drains when Wi-Fi recovers.

IoT sensor network

SyncConfig {
    node_id: format!("sensor-{}", device_id),
    scheduling: SchedulingStrategy::Adaptive,
    compression: CompressionAlgorithm::Lz4,
    offline_queue: OfflineQueueConfig { max_capacity: 100_000, ..Default::default() },
    ..Default::default()
}

Pattern: thousands of cheap devices, intermittent backhaul, large local capacity.

Mobile app

SyncConfig {
    node_id: format!("mobile-{}", user_id),
    scheduling: SchedulingStrategy::OnChange,
    conflict_strategy: ConflictStrategy::LastWriteWins,
    bidirectional: true,
    ..Default::default()
}

Pattern: per-user replica, sync on app foreground / WiFi.

Multi-region warm standby

SyncConfig {
    node_id: "us-west".into(),
    cloud_node_id: "eu-central".into(),
    conflict_strategy: ConflictStrategy::Custom("business-rules".into()),
    compression: CompressionAlgorithm::Zstd,
    ..Default::default()
}

8. Observability

Per-edge-node Prometheus metrics:

edge_sync_total
edge_sync_success
edge_sync_failed
edge_sync_conflicts_detected
edge_sync_conflicts_resolved
edge_sync_queue_size
edge_sync_bandwidth_saved_bytes
edge_sync_duration_seconds
edge_sync_conflict_resolution_duration_seconds
edge_sync_compression_ratio
edge_sync_delta_size_bytes

For fleets in the thousands, scrape via a regional aggregator and forward to your central Prometheus.


9. Production Checklist

  • Conflict strategy is correct per table (counters need MaxValue, not LastWriteWins)
  • Offline queue capacity ≥ longest expected offline window × write rate
  • Compression algorithm matches CPU/bandwidth trade-off (LZ4 mobile, Zstd server-side)
  • Adaptive scheduler is enabled (battery + reliability win)
  • HTTPS / mTLS to the cloud endpoint — never plain HTTP
  • Metrics scraping wired up; alert on edge_sync_conflicts_detected rate spikes (often signals app bug, not network issue)
  • Cloud-side has rate limits per node_id so a misbehaving edge cannot DDoS your sync endpoint
  • Fleet deployment uses the same edge-sync version everywhere — protocol is forward-compatible but mixing >1 minor version isn’t tested
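The queue-capacity line deserves numbers. A sizing sketch with a hypothetical helper, assuming you know your worst offline window and steady write rate:

```rust
/// Minimum offline-queue capacity: worst offline window x write rate,
/// padded by a safety factor for bursts and retries.
pub fn min_queue_capacity(offline_window_hours: f64, writes_per_sec: f64, safety: f64) -> u64 {
    (offline_window_hours * 3600.0 * writes_per_sec * safety).ceil() as u64
}

fn main() {
    // POS register: 24 h worst outage, ~0.2 writes/s, 2x safety margin.
    let cap = min_queue_capacity(24.0, 0.2, 2.0);
    println!("capacity >= {}", cap); // 34560, well inside a 50_000 queue
}
```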

10. Verification Status

Crate          | Documented as                                                  | Test coverage
edge-compute   | Active (single-paragraph README)                               | Tests present in workspace; surface API not deeply documented yet
edge-sync      | Production (F5.2.5, full README, 100+ tests)                   | 70 unit + 21 integration + 13 chaos
edge-ai        | Active (single-paragraph README)                               | Verify against source before committing to API
embedded-cloud | Innovation #6 of HeliosDB v7.0; production-ready in own README | Unit + integration suites present

If you’re building the production fleet today, start with edge-sync and edge-compute; pilot edge-ai and embedded-cloud against your specific workload before going wide.


Where Next

  • CDC & Streaming — feed cloud-side CDC events back to edge fleets
  • WASM Triggers — run database triggers on the edge in JS/Python/Rust
  • Multi-Tenancy — multi-tenant edge fleets (per-store tenant isolation)

References

  • Source: /home/app/Helios/Full/heliosdb-edge/crates/{edge-compute,edge-sync,edge-ai,embedded-cloud}/
  • edge-sync patent disclosure: heliosdb-edge/crates/edge-sync/PATENT_DISCLOSURE.md
  • embedded-cloud patent: “Hybrid Local-Cloud Database with Automatic Query Routing” ($15M-$22M valuation)