Federated Learning — Train Across 100+ Nodes Without Sharing Raw Data
Crate: heliosdb-federation/crates/federated-learning
Modules: 27 — including coordinator, aggregator, trainer, worker, privacy, homomorphic_encryption, smpc, zkp, compliance_gdpr, compliance_hipaa, vertical_fl, transfer_learning
Status: “Production-Ready” per docs/FEDERATED_LEARNING.md. 300+ node scalability tests included.
UVP
Centralized training is where compliance officers say no. The Full edition ships a federated learning platform that lets you train ML models across 100+ database nodes — each on its own data, in its own region, under its own jurisdiction — and only model updates ever cross the network. 8+ aggregation strategies (FedAvg, FedProx, FedYogi, FedAdam, weighted, Byzantine-robust, secure aggregation, hierarchical), built-in differential privacy with budget tracking, GDPR + HIPAA compliance modules, plus optional homomorphic encryption and zero-knowledge proofs when “differential privacy” isn’t enough on the regulator’s checklist.
Prerequisites
- A coordinator host plus ≥3 worker nodes (more is more interesting).
- Local training data on each worker — the whole point is that it doesn’t move.
- A common model architecture all workers can run.
- About 30 minutes.
1. Engine, Coordinator, Worker
The library is built around three layers:
| Layer | Type | Job |
|---|---|---|
| Engine | FederatedLearningEngine | Top-level facade, owns models, manages coordinators and workers |
| Coordinator | Coordinator | Per-model: runs rounds, selects nodes, checks convergence |
| Worker | FederatedWorker | Per-node: trains locally, ships updates, receives global model |
From src/lib.rs:
```rust
use heliosdb_federated_learning::{Config, FederatedLearningEngine};
use heliosdb_federated_learning::aggregator::AggregationStrategy;
use heliosdb_federated_learning::coordinator::RoundConfig;

let config = Config {
    aggregation_strategy: AggregationStrategy::FedAvg,
    round_config: RoundConfig::default(),
    enable_privacy: false,
    privacy_config: None,
};

let engine = FederatedLearningEngine::new(config).await?;
```

2. Register a Model
```rust
let metadata = engine.register_model(
    "fraud_detector".to_string(),
    "linear".to_string(),           // architecture name
    vec![0.1, 0.2, 0.3, /* ... */], // initial parameters
).await?;

println!("Model registered: {} (version {})", metadata.id, metadata.version);
```

The architecture string is opaque to the coordinator — it’s what FederatedWorker uses to know how to instantiate the local model. Each worker must have a matching architecture handler.
3. Pick an Aggregation Strategy
From the docs:
| Strategy | When |
|---|---|
| FedAvg | Default. Simple averaging; IID data. (McMahan et al. 2017) |
| FedProx | Heterogeneous (non-IID) data; adds a proximal term. (Li et al. 2020) |
| FedYogi | Adaptive optimization with Yogi updates. (Reddi et al. 2021) |
| FedAdam | Adaptive optimization with Adam updates. |
| WeightedAverage | Weight by sample count per worker. |
```rust
use heliosdb_federated_learning::aggregator::AggregationStrategy;

let config = Config {
    aggregation_strategy: AggregationStrategy::FedProx,
    ..Default::default()
};
```

For Byzantine-robust deployments (untrusted workers), use the secure aggregator instead — see Section 8.
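To make the table concrete, here is a self-contained sketch of the sample-weighted averaging that FedAvg-style strategies compute. The fed_avg function and its shapes are invented for illustration; they are not the crate's API.

```rust
// Sketch only: fed_avg and its shapes are invented for illustration,
// not the crate's API.

/// Sample-weighted average of per-worker parameter vectors (FedAvg).
fn fed_avg(updates: &[(Vec<f64>, usize)]) -> Vec<f64> {
    let total: usize = updates.iter().map(|(_, n)| n).sum();
    let mut global = vec![0.0; updates[0].0.len()];
    for (params, n) in updates {
        let w = *n as f64 / total as f64;
        for (g, p) in global.iter_mut().zip(params) {
            *g += w * p;
        }
    }
    global
}

fn main() {
    // Two workers with 100 and 300 local samples; the larger worker
    // contributes three quarters of the weight.
    let updates = vec![(vec![1.0, 2.0], 100), (vec![3.0, 4.0], 300)];
    println!("{:?}", fed_avg(&updates)); // [2.5, 3.5]
}
```

FedProx and the adaptive variants differ in what each worker optimizes locally and how the server applies the combined update, but the weighting idea is the same.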
4. Add Differential Privacy
```rust
use heliosdb_federated_learning::privacy::DifferentialPrivacyConfig;

let dp = DifferentialPrivacyConfig {
    epsilon: 1.0,          // privacy budget per round
    delta: 1e-5,           // failure probability
    clip_norm: 1.0,        // gradient clipping
    noise_multiplier: 1.1, // Gaussian noise scale
    ..Default::default()
};

let config = Config {
    aggregation_strategy: AggregationStrategy::FedAvg,
    enable_privacy: true,
    privacy_config: Some(dp),
    ..Default::default()
};
```

The privacy::PrivacyManager tracks budget consumption per round. When you exhaust the budget, the coordinator stops the run rather than silently degrade the guarantee.
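The clip_norm and noise_multiplier knobs map onto the standard DP recipe: bound each update's L2 norm, then add Gaussian noise scaled to that bound. A minimal sketch of the clipping step follows; clip_l2 is a hypothetical helper, not the crate's API, which handles this internally when enable_privacy is set.

```rust
// Sketch of the clipping half of the DP recipe; clip_l2 is a hypothetical
// helper, not the crate's API.

/// Scale `update` down so its L2 norm is at most `clip_norm`.
fn clip_l2(update: &[f64], clip_norm: f64) -> Vec<f64> {
    let norm = update.iter().map(|x| x * x).sum::<f64>().sqrt();
    let scale = if norm > clip_norm { clip_norm / norm } else { 1.0 };
    update.iter().map(|x| x * scale).collect()
}

fn main() {
    // [3.0, 4.0] has L2 norm 5.0; clipping to 1.0 preserves the direction
    // and shrinks the norm to 1.0.
    println!("{:?}", clip_l2(&[3.0, 4.0], 1.0));
    // A real pipeline would now add Gaussian noise with standard deviation
    // noise_multiplier * clip_norm to each coordinate before the update
    // leaves the worker.
}
```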
5. Start Training
```rust
engine.start_training("fraud_detector".to_string()).await?;
```

The coordinator now:

- Selects a subset of workers per round (NodeSelectionStrategy).
- Ships the current global model.
- Each worker trains locally.
- Workers ship parameter updates back.
- The aggregator combines updates per the strategy.
- New global model → next round.
Workers see engine.start_training only as “fetch global, train local, push update”. The orchestration is the coordinator’s job.
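The cycle above can be simulated end to end in a few lines. Everything here is a toy stand-in: local_train just nudges parameters toward a per-worker target, in place of real local training.

```rust
// Toy simulation of the round loop: local_train nudges parameters toward a
// per-worker target, standing in for real local training.

fn local_train(global: &[f64], target: &[f64], lr: f64) -> Vec<f64> {
    global.iter().zip(target).map(|(g, t)| g + lr * (t - g)).collect()
}

fn main() {
    let mut global = vec![0.0, 0.0];
    let worker_targets = [vec![1.0, 1.0], vec![3.0, 3.0]]; // per-worker local optima
    for round in 0..5 {
        // Ship the global model, train locally, collect updates.
        let updates: Vec<Vec<f64>> = worker_targets
            .iter()
            .map(|t| local_train(&global, t, 0.5))
            .collect();
        // Aggregate (plain average here).
        for i in 0..global.len() {
            global[i] = updates.iter().map(|u| u[i]).sum::<f64>() / updates.len() as f64;
        }
        println!("round {round}: {global:?}");
    }
    // The global model drifts toward the mean of the worker targets, [2.0, 2.0].
}
```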
6. Compress the Updates
Bandwidth is usually the bottleneck. The crate ships several compression strategies:
```rust
use heliosdb_federated_learning::compression::{CompressionStrategy, ModelCompressor};

let compressor = ModelCompressor::new(CompressionStrategy::TopK { k: 1000 });
// or Quantization { bits: 8 }
// or Sparsification { threshold: 0.01 }
```

Top-K and quantization can shrink updates 10-100x with minor accuracy loss. The aggregator deserializes any compressed update transparently.
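For intuition on what TopK actually ships over the wire, here is a self-contained sketch; top_k is a hypothetical helper, not the crate's ModelCompressor.

```rust
// Sketch of Top-K sparsification: keep the k largest-magnitude entries and
// ship them as (index, value) pairs. top_k is a hypothetical helper, not the
// crate's ModelCompressor.

fn top_k(update: &[f64], k: usize) -> Vec<(usize, f64)> {
    let mut indexed: Vec<(usize, f64)> = update.iter().copied().enumerate().collect();
    // Sort by descending magnitude, keep the k largest, restore index order.
    indexed.sort_by(|a, b| b.1.abs().partial_cmp(&a.1.abs()).unwrap());
    indexed.truncate(k);
    indexed.sort_by_key(|&(i, _)| i);
    indexed
}

fn main() {
    // Only the two largest-magnitude coordinates survive.
    println!("{:?}", top_k(&[0.01, -0.9, 0.05, 0.7], 2)); // [(1, -0.9), (3, 0.7)]
}
```

Shipping (index, value) pairs is what makes the 10-100x reduction possible when most gradient mass sits in a few coordinates.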
7. Hierarchical Aggregation for 200+ Nodes
For very large fleets, flat aggregation overwhelms the coordinator. The hierarchical_aggregation module groups workers into clusters with intermediate aggregators:
```rust
use heliosdb_federated_learning::hierarchical_aggregation;
// see module docs for cluster/region wiring
```

This is the same pattern Google uses for Gboard. The crate’s scalability_tests module includes 300+-node validation — see Section 11.
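The idea reduces to two levels of the same weighted average: cluster heads aggregate their workers, and the coordinator aggregates the cluster results weighted by cluster sample totals. A minimal sketch, with names invented for illustration:

```rust
// Two-level hierarchical aggregation sketch; weighted_avg is invented for
// illustration. Each level computes the same sample-weighted average.

fn weighted_avg(updates: &[(Vec<f64>, usize)]) -> (Vec<f64>, usize) {
    let total: usize = updates.iter().map(|(_, n)| n).sum();
    let mut avg = vec![0.0; updates[0].0.len()];
    for (params, n) in updates {
        let w = *n as f64 / total as f64;
        for (a, p) in avg.iter_mut().zip(params) {
            *a += w * p;
        }
    }
    (avg, total)
}

fn main() {
    // Cluster heads aggregate their workers; the coordinator only ever sees
    // one update per cluster.
    let cluster_a = weighted_avg(&[(vec![1.0], 10), (vec![3.0], 30)]);
    let cluster_b = weighted_avg(&[(vec![5.0], 20)]);
    let (global, _) = weighted_avg(&[cluster_a, cluster_b]);
    // Matches flat weighted averaging over all three workers (up to rounding).
    println!("{global:?}");
}
```

Because each level carries its sample total forward, the result is mathematically the same as flat aggregation, while the coordinator's fan-in drops from N workers to the number of clusters.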
8. Byzantine-Robust + Secure Aggregation
When workers can’t be trusted (open consortium, edge devices, untrusted partners):
```rust
use heliosdb_federated_learning::secure_aggregation::{
    ByzantineRobustAggregator, MaliciousDetector,
};
use heliosdb_federated_learning::advanced_security::{
    AdvancedSecuritySystem, SecurityConfig,
};

let aggregator = ByzantineRobustAggregator::new(/* config */);
let detector = MaliciousDetector::new(/* config */);
```

On top of that, the advanced_security module adds backdoor detection and certified robustness. Combine it with the secure_aggregation module to keep individual updates encrypted even from the coordinator itself.
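One classic Byzantine-robust rule is the coordinate-wise median, which a minority of arbitrarily corrupted updates cannot move far. The crate's ByzantineRobustAggregator may implement a different rule; this sketch is purely illustrative.

```rust
// Coordinate-wise median, a classic Byzantine-robust aggregation rule.
// Purely illustrative; the crate's ByzantineRobustAggregator may differ.

fn coordinate_median(updates: &[Vec<f64>]) -> Vec<f64> {
    (0..updates[0].len())
        .map(|i| {
            let mut col: Vec<f64> = updates.iter().map(|u| u[i]).collect();
            col.sort_by(|a, b| a.partial_cmp(b).unwrap());
            col[col.len() / 2] // upper median for even counts
        })
        .collect()
}

fn main() {
    let mut all = vec![vec![1.0, 1.0], vec![1.1, 0.9], vec![0.9, 1.1]];
    all.push(vec![1000.0, -1000.0]); // one malicious worker
    // The outlier cannot drag the result far from the honest cluster.
    println!("{:?}", coordinate_median(&all)); // [1.1, 1.0]
}
```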
9. Heavyweight Privacy: Homomorphic Encryption + ZKP + SMPC
When DP isn’t enough:
```rust
use heliosdb_federated_learning::homomorphic_encryption::HomomorphicEncryption;
use heliosdb_federated_learning::smpc::{SMPCConfig, ShamirSecretSharing};
use heliosdb_federated_learning::zkp::{ZKPConfig, ZKPSystem};

// HE: aggregate on encrypted updates
// SMPC: secret-share parameters across n participants
// ZKP: prove update validity without revealing it
```

These are heavy — orders of magnitude slower than DP — but they’re there when the threat model demands it. See docs/FEDERATED_LEARNING.md for the full crypto chapter.
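For intuition on the SMPC leg, here is additive secret sharing, the simplest variant of the idea. The crate exposes Shamir sharing, which additionally tolerates dropped participants; the share function and the toy rng below are invented for the example.

```rust
// Additive secret sharing sketch; share and the toy rng are invented for the
// example. A real deployment needs a CSPRNG and field arithmetic.

fn share(secret: f64, n: usize, rng: &mut impl FnMut() -> f64) -> Vec<f64> {
    let mut shares: Vec<f64> = (0..n - 1).map(|_| rng()).collect();
    let partial: f64 = shares.iter().sum();
    shares.push(secret - partial); // the last share makes the sum come out right
    shares
}

fn main() {
    let mut counter = 0.0;
    let mut demo_rng = || {
        counter += 3.7; // deterministic stand-in for a real CSPRNG
        counter
    };
    let shares = share(42.0, 3, &mut demo_rng);
    // No single share reveals the secret, but their sum reconstructs it, so
    // participants can sum shares locally and reveal only the aggregate.
    println!("shares = {shares:?}");
    println!("reconstructed = {}", shares.iter().sum::<f64>());
}
```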
10. GDPR + HIPAA Compliance
Compliance modules are first-class:
```rust
use heliosdb_federated_learning::compliance_gdpr::GdprComplianceManager;
use heliosdb_federated_learning::compliance_hipaa::HipaaComplianceManager;

let gdpr = GdprComplianceManager::new(/* config */);
let report = gdpr.generate_compliance_report().await?;

let hipaa = HipaaComplianceManager::new(/* config */);
let phi_audit = hipaa.audit_phi_access().await?;
```

These produce the artefacts auditors actually want — consent records, data category logs, audit entries, exportable reports. Per the lib.rs comments, both modules landed in Week 11 of the build-out and cover the validation surface.
11. Use Cases (From Source)
Per docs/FEDERATED_LEARNING.md:
- Multi-Institution Healthcare — train diagnostic models across hospitals without sharing patient data.
- Financial Fraud Detection — collaborative across banks, customer privacy preserved.
- IoT/Edge AI — train across edge devices with limited connectivity.
- Cross-Organization Analytics — collaborative analytics maintaining competitive secrecy.
- Regulatory Compliance — GDPR/HIPAA-compliant ML.
12. Inference & Serving
After training, expose the model via the built-in serving API:
```rust
use heliosdb_federated_learning::model_serving::{
    InferenceRequest, ModelServingEngine, ServingConfig,
};

let serving = ModelServingEngine::new(ServingConfig::default()).await?;
let response = serving.infer(InferenceRequest { /* ... */ }).await?;
```

Supports REST and gRPC; A/B testing for model versions is built in (ABTestConfig).
13. SQL Interface
A SQL surface is registered via sql_interface::SQLFunctionRegistry:
```sql
SELECT predict('fraud_detector', features)
FROM transactions
WHERE amount > 1000;
```

This dispatches into the FederatedLearningAPI for in-database inference.
Where Next
- conversational-bi.md — natural language → SQL → results.
- docs/FEDERATED_LEARNING.md — full design + API reference.
- Sibling crates in heliosdb-federation/crates/: federated-query (cross-cloud query federation), sovereignty (data residency).