LydianAI distributed inference

How it works

This first LydianAI app is a federated/distributed training proof of concept (PoC): a FastAPI coordinator runs FedAvg rounds while workers (CPU-only or GPU) train locally and submit model updates.

Server / Coordinator (macOS CPU OK)

Hosts a FastAPI API, shards CIFAR-10 across workers, aggregates updates using FedAvg, and tracks metrics per round.
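At the aggregation step, FedAvg reduces to a sample-weighted average of worker updates. A minimal sketch of that math — parameter names, shapes, and the tiny example values are illustrative, not the project's actual code:

```python
import numpy as np

def fedavg(updates):
    """updates: list of (params, num_samples) pairs, where params maps
    parameter names to arrays. Returns the sample-weighted average."""
    total = sum(n for _, n in updates)
    keys = updates[0][0].keys()
    return {
        k: sum(params[k] * (n / total) for params, n in updates)
        for k in keys
    }

# Example: two workers with different local dataset sizes
w1 = {"w": np.array([1.0, 1.0])}   # trained on 100 samples
w2 = {"w": np.array([3.0, 3.0])}   # trained on 300 samples
avg = fedavg([(w1, 100), (w2, 300)])
# avg["w"] == array([2.5, 2.5]), i.e. 1.0 * 0.25 + 3.0 * 0.75
```

Weighting by sample count (rather than a plain mean) keeps workers with larger shards from being underrepresented in the global model.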

Workers (Ubuntu GPU/CPU)

Register → poll → download model → train locally → submit update → repeat. Workers can be modern GPUs, legacy Pascal GPUs, or CPU-only.
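The register → poll → download → train → submit loop can be sketched end to end. Everything below — the `FakeCoordinator` class, its method names, and the scalar "model" — is a hypothetical in-memory stand-in for the real HTTP calls, kept small to show the control flow only:

```python
import time

class FakeCoordinator:
    """In-memory stand-in for the FastAPI coordinator (illustration only)."""
    def __init__(self):
        self.model = 0.0      # a single scalar "model" for brevity
        self.updates = []
    def register(self):
        return "worker-0"
    def round_ready(self, worker_id):
        return True           # a real worker would poll an HTTP endpoint here
    def download_model(self, worker_id):
        return self.model
    def submit_update(self, worker_id, update, num_samples):
        self.updates.append((update, num_samples))
        self.model = update   # one worker: FedAvg degenerates to the update

def local_train(weights):
    # Pretend one local epoch nudges the scalar model halfway toward 1.0
    return weights + 0.5 * (1.0 - weights), 100

def run_worker(api, rounds):
    worker_id = api.register()                   # register
    for _ in range(rounds):
        while not api.round_ready(worker_id):    # poll
            time.sleep(0.1)
        weights = api.download_model(worker_id)  # download model
        update, n = local_train(weights)         # train locally
        api.submit_update(worker_id, update, n)  # submit update

api = FakeCoordinator()
run_worker(api, rounds=3)
```

Swapping `FakeCoordinator` for a thin HTTP client against the coordinator's endpoints leaves `run_worker` unchanged, which is the point of the loop's shape.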


Key design goals

Heterogeneous hardware

Mixed compute is the default: different GPUs, different speeds, and even CPU-only machines.
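One common way to make mixed compute the default is to pick the device at startup and fall back to CPU. A minimal sketch (assumes PyTorch; guarding the import also lets the snippet run on a box without Torch installed):

```python
# Device selection sketch: prefer CUDA when available, otherwise CPU.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:  # assumption: CPU-only machine without Torch
    device = "cpu"

print(f"training on {device}")
```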

Legacy GPU support

Pascal GPUs (sm_61) require older Torch wheels and often older Python. The PoC supports a LEGACY install path and a legacy Torch mode.
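In practice the LEGACY path boils down to pinning older wheels in a dedicated environment. A hypothetical setup fragment — the exact versions are assumptions and must be matched to the card and installed driver; `torch==1.13.1+cu117` is one published CUDA 11.x combination commonly used on older drivers:

```shell
# LEGACY install sketch (versions are assumptions -- match to your driver).
python3 -m venv .venv && source .venv/bin/activate
pip install "torch==1.13.1+cu117" \
    --extra-index-url https://download.pytorch.org/whl/cu117
```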

Simple networking

All machines join the same Tailscale tailnet; workers reach the coordinator over its stable 100.x Tailscale address.


What v1 is (and isn’t)

It is

  • FedAvg training rounds
  • FastAPI coordinator + worker loop
  • CLI client to start runs, monitor progress, and fetch results
  • NEW vs LEGACY GPU install paths

It isn’t

  • A production scheduler
  • A hosted “managed service”
  • A full marketplace / multi-tenant platform