Erdos Federated Computing — many collaborators, one model, no shared data

Why "Erdős"

Collaboration without sharing.

Paul Erdős published with more than 500 co-authors — his whole body of work was built on collaboration, immortalized by the Erdős number. Federated learning is collaboration of exactly that kind: many parties improve a shared model together, while their private data never leaves home. Erdos FC makes that pattern small enough to read and easy to extend.

"Many minds, one result — and nobody hands over their notebook."

Each round, sites train locally and send only model updates. The server averages them into a new global model. The raw data stays put.
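
Written out, one round of averaging looks like the formula below. It assumes the sample-weighted variant named under Federated Averaging. The symbols are illustrative choices, not notation taken from the code: K sites, n_i local samples at site i, w^{(t)} the global model at round t, and w_i^{(t+1)} the model site i returns after training from w^{(t)}.

```latex
w^{(t+1)} \;=\; \sum_{i=1}^{K} \frac{n_i}{N}\, w_i^{(t+1)},
\qquad N \;=\; \sum_{i=1}^{K} n_i
```

A site with more samples pulls the global model further toward its own result. Only w_i^{(t+1)} and n_i cross the wire.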

Feature highlights

Everything in the federated loop — swappable.

Small abstract base classes for each part of the loop, with working defaults out of the box.

🧭

Controller / worker model

A server-side workflow orchestrates rounds; client-side executors do the local training. Clean separation, like production FL runtimes.

➗

Federated Averaging

Sample-weighted FedAvg aggregation that is dtype-safe for weights and integer buffers alike. Drop-in slot for FedProx/FedOpt next.

🔒

Privacy on the path

A Gaussian filter clips and noises each update before it leaves the site, so the server never sees raw local weights.

🧪

One-process simulator

Run a whole federation locally behind a single broadcast() contract — then deploy the same code on a real transport.

🔥

PyTorch "Client API"

Wrap an ordinary PyTorch training loop into a federated client with almost no changes — bring a model and a dataset.

🧩

Framework-agnostic core

The APIs, aggregator, and runtime import no deep-learning library, leaving room for TensorFlow / JAX executors beside PyTorch.
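
To picture the single broadcast() contract from the one-process simulator card, here is a minimal sketch in plain Python. The names `InProcessServer` and `make_site` are hypothetical. The signature is also simplified: the real call shown under "How it works" takes a context argument as well. This is an illustration of the idea, not the Erdos FC runtime.

```python
from typing import Callable, Dict

# Hypothetical stand-in: a "site" is any callable mapping (task name, payload) to a result.
Site = Callable[[str, dict], dict]


class InProcessServer:
    """Toy server that reaches every site through one method: broadcast()."""

    def __init__(self, sites: Dict[str, Site]):
        self.sites = sites

    def broadcast(self, task_name: str, payload: dict) -> Dict[str, dict]:
        # Call each site in turn, in the same process. A networked transport
        # could send the same payload over the wire and return the same
        # name -> result mapping, so the calling code would not change.
        return {name: site(task_name, dict(payload))
                for name, site in self.sites.items()}


def make_site(local_values):
    """Build a site whose private data lives only inside this closure."""
    def site(task_name, payload):
        # "Training" here is a placeholder: shift the parameter by the local mean.
        shift = sum(local_values) / len(local_values)
        return {"params": payload["params"] + shift,
                "num_samples": len(local_values)}
    return site


server = InProcessServer({
    "site-1": make_site([1.0, 2.0, 3.0]),
    "site-2": make_site([10.0]),
})
results = server.broadcast("train", {"params": 0.0})
```

Because callers only ever see `broadcast()`, swapping the in-process loop for a network transport would be a change inside the server class, not in the workflow that uses it.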

How it works

The round loop.

Scatter the global model → each site trains on private data → filter the update → gather & aggregate → repeat.

server.py

# every round, on the server
task   = Shareable(params=global_model)
out    = server.broadcast("train", task, ctx)

aggregator.reset(ctx)
for name, update in out:
    aggregator.accept(update, ctx)   # weighted by n_i
global_model = aggregator.aggregate(ctx).params
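
The reset / accept / aggregate calls above can be sketched with NumPy as follows. This is a toy, not the Erdos FC aggregator. The class name `WeightedAverager` is hypothetical, the context argument is left out, and the dtype handling (accumulate in float64, round integer buffers, cast back) is one plausible reading of "dtype-safe".

```python
import numpy as np


class WeightedAverager:
    """Toy sample-weighted averager over dicts of arrays."""

    def reset(self):
        self._sums = {}     # float64 running sums, one per parameter name
        self._dtypes = {}   # original dtype of each parameter
        self._total = 0     # total samples accepted this round

    def accept(self, params, num_samples):
        if num_samples <= 0:
            raise ValueError("num_samples must be positive")
        for name, value in params.items():
            arr = np.asarray(value)
            self._dtypes.setdefault(name, arr.dtype)
            # Accumulate in float64 so integer buffers do not overflow or truncate.
            contrib = arr.astype(np.float64) * num_samples
            self._sums[name] = self._sums.get(name, 0.0) + contrib
        self._total += num_samples

    def aggregate(self):
        if self._total == 0:
            raise RuntimeError("no updates accepted")
        out = {}
        for name, total in self._sums.items():
            mean = np.asarray(total / self._total)
            dtype = self._dtypes[name]
            if np.issubdtype(dtype, np.integer):
                mean = np.rint(mean)    # round integer buffers rather than truncate
            out[name] = mean.astype(dtype)
        return out


agg = WeightedAverager()
agg.reset()
agg.accept({"w": np.array([1.0, 3.0], dtype=np.float32),
            "steps": np.array(10, dtype=np.int64)}, num_samples=100)
agg.accept({"w": np.array([3.0, 5.0], dtype=np.float32),
            "steps": np.array(22, dtype=np.int64)}, num_samples=300)
merged = agg.aggregate()
```

The second site holds three times as many samples, so the merged weights land three quarters of the way toward its values, and the integer `steps` buffer comes back as an integer.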

client.py

# on each client / site
class PTTrainer(Executor):
    def execute(self, task, shareable, ctx):
        model.load_state_dict(shareable.params)
        train(model, self.local_data)     # data stays here
        n = len(self.local_data)          # sample count the server uses as the FedAvg weight
        return Shareable(params=model.state_dict(),
                         meta={"num_samples": n})
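
The "filter the update" step in the round loop sits between local training and the return to the server. Here is a rough sketch of a clip-and-noise filter in NumPy. It makes several assumptions: the update is a dict of arrays, clipping uses the global L2 norm, and `sigma` is an absolute noise standard deviation. The function name `clip_and_noise` is hypothetical. Since accounted differential privacy is still on the roadmap, treat this as the mechanism only, with no formal privacy guarantee attached.

```python
import numpy as np


def clip_and_noise(update, clip_norm, sigma, rng):
    """Scale the update so its global L2 norm is at most clip_norm,
    then add independent N(0, sigma^2) noise to every entry."""
    flat = np.concatenate(
        [np.ravel(v).astype(np.float64) for v in update.values()]
    )
    norm = float(np.linalg.norm(flat))
    # Shrink only when the update is too large; never scale it up.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0

    filtered = {}
    for name, value in update.items():
        clipped = np.asarray(value, dtype=np.float64) * scale
        noise = rng.normal(0.0, sigma, size=clipped.shape)
        filtered[name] = clipped + noise   # new arrays; the input is not modified
    return filtered


rng = np.random.default_rng(0)
update = {"w": np.array([3.0, 4.0]), "b": np.array([0.0])}

clipped_only = clip_and_noise(update, clip_norm=1.0, sigma=0.0, rng=rng)
private = clip_and_noise(update, clip_norm=1.0, sigma=0.02, rng=rng)
```

With `sigma=0.0` the filter only clips: the example update has norm 5, so it is scaled by 0.2. With a positive `sigma`, every value that leaves the site is perturbed.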

Quickstart

From clone to a training federation in a few commands.

terminal

# 1) install
git clone https://github.com/sunnyinAI/ErdosFC.git
cd ErdosFC && pip install -e ".[torch]"

# 2) run the end-to-end FedAvg demo (4 simulated sites, 5 rounds)
cd examples/hello-pytorch-mnist
python run_simulation.py

# 3) optional: turn on the privacy filter
python run_simulation.py --dp-sigma 0.02

No internet? The demo falls back to a synthetic dataset automatically, so it always runs.

Roadmap

Where it's heading.

01 gRPC transport behind the existing broadcast contract: same code, real network.
02 More aggregators: FedProx, FedAdam / FedOpt, and variance-reduction corrections.
03 Secure aggregation and a properly accounted differential-privacy path.
04 Provisioning: TLS certificates and per-site startup kits for cross-organization runs.
05 More executors: TensorFlow and JAX alongside PyTorch.
06 Governed aggregation: a coordinator policy that applies a quality floor and an anti-capture cap, so low-value updates are filtered out and no single participant can dominate the shared model.
07 Sovereign model artifacts: from one governed base model, each participant derives and owns a sovereign variant aligned to its own corpus and needs.
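
To make the governed-aggregation item concrete, here is one way a quality floor and an anti-capture cap could turn into aggregation weights. Everything in it is an assumption for illustration: the function name `governed_weights`, the per-site quality scores, and the choice to redistribute capped weight in proportion to sample counts. The roadmap item is not implemented yet, so this is a sketch of the idea, not the planned policy.

```python
def governed_weights(num_samples, quality, floor, cap):
    """Return per-site aggregation weights that sum to 1.

    Sites with quality below `floor` get weight 0. No remaining site may
    hold more than `cap` of the total weight. Sample counts must be positive.
    """
    kept = {s: float(n) for s, n in num_samples.items() if quality[s] >= floor}
    if not kept:
        raise ValueError("no site passed the quality floor")
    if cap * len(kept) < 1.0:
        raise ValueError("cap is too small for the number of remaining sites")

    weights = {}
    free = dict(kept)   # sites not yet pinned at the cap
    budget = 1.0        # weight still to be shared among the free sites
    while free:
        total = sum(free.values())
        over = [s for s, n in free.items() if budget * n / total > cap + 1e-12]
        if not over:
            # Everyone fits under the cap: split the budget by sample count.
            for s, n in free.items():
                weights[s] = budget * n / total
            break
        # Pin the dominant sites at the cap and share the rest again.
        for s in over:
            weights[s] = cap
            budget -= cap
            del free[s]

    for s in num_samples:
        weights.setdefault(s, 0.0)   # filtered-out sites contribute nothing
    return weights


weights = governed_weights(
    num_samples={"a": 900, "b": 50, "c": 50, "d": 10},
    quality={"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1},
    floor=0.5,
    cap=0.5,
)
```

In this example site `d` falls below the floor and is dropped. Site `a` would hold 90% of the weight by sample count alone, so it is held at the 50% cap and the rest is split between `b` and `c`.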

One model.
Many collaborators.
No shared data.