
Federated Learning — Train Across 100+ Nodes Without Sharing Raw Data

  • Crate: heliosdb-federation/crates/federated-learning
  • Modules: 27 — including coordinator, aggregator, trainer, worker, privacy, homomorphic_encryption, smpc, zkp, compliance_gdpr, compliance_hipaa, vertical_fl, transfer_learning
  • Status: “Production-Ready” per docs/FEDERATED_LEARNING.md. 300+ node scalability tests included.


UVP

Centralized training is where compliance officers say no. The Full edition ships a federated learning platform that lets you train ML models across 100+ database nodes — each on its own data, in its own region, under its own jurisdiction — and only model updates ever cross the network. 8+ aggregation strategies (FedAvg, FedProx, FedYogi, FedAdam, weighted, Byzantine-robust, secure aggregation, hierarchical), built-in differential privacy with budget tracking, GDPR + HIPAA compliance modules, plus optional homomorphic encryption and zero-knowledge proofs when “differential privacy” isn’t enough on the regulator’s checklist.


Prerequisites

  • A coordinator host plus ≥3 worker nodes (more is more interesting).
  • Local training data on each worker — the whole point is that it doesn’t move.
  • A common model architecture all workers can run.
  • About 30 minutes.

1. Engine, Coordinator, Worker

The library is built around three layers:

Layer        Type                     Job
Engine       FederatedLearningEngine  Top-level facade, owns models, manages coordinators and workers
Coordinator  Coordinator              Per-model: runs rounds, selects nodes, checks convergence
Worker       FederatedWorker          Per-node: trains locally, ships updates, receives global model

From src/lib.rs:

use heliosdb_federated_learning::{Config, FederatedLearningEngine};
use heliosdb_federated_learning::aggregator::AggregationStrategy;
use heliosdb_federated_learning::coordinator::RoundConfig;

let config = Config {
    aggregation_strategy: AggregationStrategy::FedAvg,
    round_config: RoundConfig::default(),
    enable_privacy: false,
    privacy_config: None,
};
let engine = FederatedLearningEngine::new(config).await?;

2. Register a Model

let metadata = engine.register_model(
    "fraud_detector".to_string(),
    "linear".to_string(),           // architecture name
    vec![0.1, 0.2, 0.3, /* ... */], // initial parameters
).await?;
println!("Model registered: {} (version {})", metadata.id, metadata.version);

The architecture string is opaque to the coordinator — it’s what FederatedWorker uses to know how to instantiate the local model. Each worker must have a matching architecture handler.
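What such a handler might look like is sketched below — the trait and method names here are illustrative, not the crate’s actual worker API:

// Hypothetical sketch of a worker-side architecture handler.
// Trait and names are illustrative; see the crate docs for the real interface.
trait LocalModel {
    /// Instantiate the model from the coordinator's initial parameters.
    fn from_params(params: &[f64]) -> Self where Self: Sized;
    /// Run local training and return updated parameters.
    fn train(&mut self, features: &[Vec<f64>], labels: &[f64]) -> Vec<f64>;
}

struct LinearModel {
    weights: Vec<f64>,
}

impl LocalModel for LinearModel {
    fn from_params(params: &[f64]) -> Self {
        LinearModel { weights: params.to_vec() }
    }

    fn train(&mut self, features: &[Vec<f64>], labels: &[f64]) -> Vec<f64> {
        let lr = 0.01;
        // One epoch of plain SGD on squared error, purely for illustration.
        for (x, &y) in features.iter().zip(labels) {
            let pred: f64 = self.weights.iter().zip(x).map(|(w, xi)| w * xi).sum();
            let err = pred - y;
            for (w, xi) in self.weights.iter_mut().zip(x) {
                *w -= lr * err * xi;
            }
        }
        self.weights.clone()
    }
}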


3. Pick an Aggregation Strategy

From the docs:

Strategy         When
FedAvg           Default. Simple averaging, IID data. (McMahan et al. 2017)
FedProx          Heterogeneous (non-IID) data. Adds proximal term. (Li et al. 2020)
FedYogi          Adaptive optimization with Yogi updates. (Reddi et al. 2021)
FedAdam          Adaptive optimization with Adam updates.
WeightedAverage  Weight by sample count per worker.

use heliosdb_federated_learning::aggregator::AggregationStrategy;

let config = Config {
    aggregation_strategy: AggregationStrategy::FedProx,
    ..Default::default()
};

For Byzantine-robust deployments (untrusted workers), use the secure aggregator instead — see Section 8.
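For intuition, FedAvg itself is just a sample-count-weighted mean of the workers’ parameter vectors. A standalone sketch of the arithmetic (the crate’s aggregator does this internally):

/// Sample-count-weighted average of worker updates: the core of FedAvg.
/// Each entry pairs a parameter vector with that worker's local sample count.
fn fedavg(updates: &[(Vec<f64>, usize)]) -> Vec<f64> {
    let total: usize = updates.iter().map(|(_, n)| n).sum();
    let dim = updates[0].0.len();
    let mut global = vec![0.0; dim];
    for (params, n) in updates {
        let w = *n as f64 / total as f64; // weight by share of total samples
        for (g, p) in global.iter_mut().zip(params) {
            *g += w * p;
        }
    }
    global
}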


4. Add Differential Privacy

use heliosdb_federated_learning::privacy::DifferentialPrivacyConfig;

let dp = DifferentialPrivacyConfig {
    epsilon: 1.0,          // privacy budget per round
    delta: 1e-5,           // failure probability
    clip_norm: 1.0,        // gradient clipping bound (L2)
    noise_multiplier: 1.1, // Gaussian noise scale
    ..Default::default()
};
let config = Config {
    aggregation_strategy: AggregationStrategy::FedAvg,
    enable_privacy: true,
    privacy_config: Some(dp),
    ..Default::default()
};

The privacy::PrivacyManager tracks budget consumption per round. When you exhaust the budget, the coordinator stops the run rather than silently degrade the guarantee.
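Conceptually, the DP step clips each update’s L2 norm and adds calibrated Gaussian noise. A standalone sketch of that mechanism — illustrative only, since PrivacyManager also handles the budget accounting (uses the rand crate):

use rand::Rng;

/// What the DP step conceptually does to a single update:
/// clip its L2 norm to clip_norm, then add Gaussian noise per coordinate.
fn privatize(update: &mut [f64], clip_norm: f64, noise_multiplier: f64) {
    // 1. Clip: scale the vector down if its L2 norm exceeds clip_norm.
    let norm = update.iter().map(|v| v * v).sum::<f64>().sqrt();
    if norm > clip_norm {
        let scale = clip_norm / norm;
        update.iter_mut().for_each(|v| *v *= scale);
    }
    // 2. Noise: add N(0, (noise_multiplier * clip_norm)^2) to each coordinate.
    let sigma = noise_multiplier * clip_norm;
    let mut rng = rand::thread_rng();
    for v in update.iter_mut() {
        // Box-Muller transform for a standard Gaussian sample.
        let u1: f64 = 1.0 - rng.gen::<f64>(); // avoid ln(0)
        let u2: f64 = rng.gen();
        let z = (-2.0 * u1.ln()).sqrt() * (std::f64::consts::TAU * u2).cos();
        *v += sigma * z;
    }
}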


5. Start Training

engine.start_training("fraud_detector".to_string()).await?;

The coordinator now:

  1. Selects a subset of workers per round (NodeSelectionStrategy).
  2. Ships the current global model.
  3. Each worker trains locally.
  4. Workers ship parameter updates back.
  5. The aggregator combines updates per the strategy.
  6. New global model → next round.

From a worker’s perspective, start_training reduces to “fetch global, train local, push update”; the orchestration is the coordinator’s job.
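For intuition, here is a toy, dependency-free simulation of one such round — the closure stands in for real local training, and in production the crate handles all the networking:

/// Toy single-round simulation: ship the global model to each worker,
/// "train" locally, then average the returned updates (plain FedAvg,
/// equal weights). No crate types, no networking — intuition only.
fn simulate_round(global: &[f64], workers: &[Vec<f64>]) -> Vec<f64> {
    let updates: Vec<Vec<f64>> = workers
        .iter()
        .map(|local_mean| {
            global
                .iter()
                .zip(local_mean)
                .map(|(g, m)| g + 0.1 * (m - g)) // stand-in for local training
                .collect()
        })
        .collect();
    let dim = global.len();
    (0..dim)
        .map(|i| updates.iter().map(|u| u[i]).sum::<f64>() / updates.len() as f64)
        .collect()
}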


6. Compress the Updates

Bandwidth is usually the bottleneck. The crate ships several compression strategies:

use heliosdb_federated_learning::compression::{CompressionStrategy, ModelCompressor};

let compressor = ModelCompressor::new(CompressionStrategy::TopK { k: 1000 });
// or CompressionStrategy::Quantization { bits: 8 }
// or CompressionStrategy::Sparsification { threshold: 0.01 }

Top-K and quantization can shrink updates 10-100x with minor accuracy loss, and the aggregator decompresses incoming updates transparently, whichever strategy produced them.
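The idea behind TopK is easy to see in isolation: send only the k largest-magnitude coordinates as (index, value) pairs. A sketch of the technique (not the crate’s wire format):

/// Top-K sparsification: keep the k largest-magnitude coordinates and their
/// indices, dropping the rest. Sending ~k (index, value) pairs instead of the
/// full vector is where the 10-100x bandwidth saving comes from.
fn top_k(update: &[f64], k: usize) -> Vec<(usize, f64)> {
    let mut indexed: Vec<(usize, f64)> = update.iter().copied().enumerate().collect();
    indexed.sort_by(|a, b| b.1.abs().partial_cmp(&a.1.abs()).unwrap());
    indexed.truncate(k);
    indexed
}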


7. Hierarchical Aggregation for 200+ Nodes

For very large fleets, flat aggregation overwhelms the coordinator. The hierarchical_aggregation module groups workers into clusters with intermediate aggregators:

use heliosdb_federated_learning::hierarchical_aggregation;
// see module docs for cluster/region wiring

This is the same pattern Google uses for Gboard. The crate’s scalability_tests module includes 300+-node validation — see Section 11.
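The shape of the computation is simple: average within each cluster, then average the cluster results, so the coordinator sees one update per cluster instead of one per worker. A standalone sketch of that idea (not the module’s actual wiring):

/// Two-level aggregation: each cluster's intermediate aggregator averages its
/// own workers (level 1), then the coordinator averages the cluster results
/// (level 2). Equal weights for simplicity.
fn hierarchical_average(clusters: &[Vec<Vec<f64>>]) -> Vec<f64> {
    let average = |vs: &[Vec<f64>]| -> Vec<f64> {
        let dim = vs[0].len();
        (0..dim)
            .map(|i| vs.iter().map(|v| v[i]).sum::<f64>() / vs.len() as f64)
            .collect()
    };
    // Level 1: one intermediate result per cluster.
    let cluster_means: Vec<Vec<f64>> = clusters.iter().map(|c| average(c)).collect();
    // Level 2: the coordinator only ever sees one update per cluster.
    average(&cluster_means)
}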


8. Byzantine-Robust + Secure Aggregation

When workers can’t be trusted (open consortium, edge devices, untrusted partners):

use heliosdb_federated_learning::secure_aggregation::{
    ByzantineRobustAggregator, MaliciousDetector,
};
use heliosdb_federated_learning::advanced_security::{
    AdvancedSecuritySystem, SecurityConfig,
};

let aggregator = ByzantineRobustAggregator::new(/* config */);
let detector = MaliciousDetector::new(/* config */);

On top of the robust aggregator, advanced_security adds backdoor detection and certified robustness. Combine it with secure aggregation (the same secure_aggregation module) to keep individual updates encrypted even from the coordinator itself.
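One classic Byzantine-robust rule is the coordinate-wise median, under which a bounded minority of malicious updates cannot drag any coordinate arbitrarily far. A sketch of that rule (ByzantineRobustAggregator’s actual rule may differ):

/// Coordinate-wise median across worker updates. Unlike a mean, a single
/// outlier worker cannot move the result beyond the range of honest values
/// in any coordinate.
fn coordinate_median(updates: &[Vec<f64>]) -> Vec<f64> {
    let dim = updates[0].len();
    (0..dim)
        .map(|i| {
            let mut column: Vec<f64> = updates.iter().map(|u| u[i]).collect();
            column.sort_by(|a, b| a.partial_cmp(b).unwrap());
            column[column.len() / 2] // median of this coordinate across workers
        })
        .collect()
}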


9. Heavyweight Privacy: Homomorphic Encryption + ZKP + SMPC

When DP isn’t enough:

use heliosdb_federated_learning::homomorphic_encryption::HomomorphicEncryption;
use heliosdb_federated_learning::smpc::{SMPCConfig, ShamirSecretSharing};
use heliosdb_federated_learning::zkp::{ZKPConfig, ZKPSystem};

// HE:   aggregate on encrypted updates
// SMPC: secret-share parameters across n participants
// ZKP:  prove update validity without revealing it

These are heavy — orders of magnitude slower than DP — but they’re there when the threat model demands it. See docs/FEDERATED_LEARNING.md for the full crypto chapter.
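The core SMPC intuition fits in a few lines: split each value into shares so that no single participant learns anything, while the shares still sum to the secret. Below is toy additive sharing over f64 — the crate’s ShamirSecretSharing is a threshold scheme over a finite field, so treat this strictly as intuition (uses the rand crate):

use rand::Rng;

/// Additive secret sharing: split `secret` into n random shares that sum to it.
/// Any n-1 shares look like pure noise; only the full set reconstructs the value.
/// Summing everyone's shares yields the aggregate without exposing any input.
fn additive_shares(secret: f64, n: usize) -> Vec<f64> {
    let mut rng = rand::thread_rng();
    let mut shares: Vec<f64> = (0..n - 1).map(|_| rng.gen_range(-1.0..1.0)).collect();
    let mask: f64 = shares.iter().sum();
    shares.push(secret - mask); // last share makes the total come out to `secret`
    shares
}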


10. GDPR + HIPAA Compliance

Compliance modules are first-class:

use heliosdb_federated_learning::compliance_gdpr::GdprComplianceManager;
use heliosdb_federated_learning::compliance_hipaa::HipaaComplianceManager;

let gdpr = GdprComplianceManager::new(/* config */);
let report = gdpr.generate_compliance_report().await?;

let hipaa = HipaaComplianceManager::new(/* config */);
let phi_audit = hipaa.audit_phi_access().await?;

These produce the artefacts auditors actually want — consent records, data category logs, audit entries, exportable reports. Per the lib.rs comments, both modules landed in Week 11 of the build-out and cover the validation surface.


11. Use Cases (From Source)

Per docs/FEDERATED_LEARNING.md:

  1. Multi-Institution Healthcare — train diagnostic models across hospitals without sharing patient data.
  2. Financial Fraud Detection — collaborative across banks, customer privacy preserved.
  3. IoT/Edge AI — train across edge devices with limited connectivity.
  4. Cross-Organization Analytics — collaborative analytics maintaining competitive secrecy.
  5. Regulatory Compliance — GDPR/HIPAA-compliant ML.

12. Inference & Serving

After training, expose the model via the built-in serving API:

use heliosdb_federated_learning::model_serving::{
    InferenceRequest, ModelServingEngine, ServingConfig,
};

let serving = ModelServingEngine::new(ServingConfig::default()).await?;
let response = serving.infer(InferenceRequest { /* ... */ }).await?;

Supports REST and gRPC; A/B testing for model versions is built in (ABTestConfig).
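A/B routing of this kind is typically deterministic hash-based bucketing: the same request key always hits the same model version. A sketch of the concept (illustrative, not how ABTestConfig is implemented):

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministic traffic split: hash the request key and route a fixed
/// percentage of keys to the candidate model. Stable across requests,
/// so a given user always sees the same variant.
fn ab_bucket(request_key: &str, percent_to_b: u64) -> &'static str {
    let mut h = DefaultHasher::new();
    request_key.hash(&mut h);
    if h.finish() % 100 < percent_to_b { "model_b" } else { "model_a" }
}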


13. SQL Interface

A SQL surface is registered via sql_interface::SQLFunctionRegistry:

SELECT predict('fraud_detector', features)
FROM transactions
WHERE amount > 1000;

This dispatches into the FederatedLearningAPI for in-database inference.


Where Next