Decoupled DiLoCo removes the need for synchronous communication between workers at every step of distributed training. By decoupling the frequent local update steps from the infrequent global synchronization step, Google DeepMind's approach reduces the bandwidth bottleneck and improves resilience against network failures: a slow or dropped link delays only an occasional global update instead of stalling every iteration. This lets practitioners scale training across geographically dispersed clusters connected by unstable links.
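The decoupling described above can be sketched as a two-level loop: each worker takes many cheap local optimizer steps on its own shard, and only the rare outer step exchanges any data between sites. The sketch below is a minimal single-process illustration under stated assumptions; the worker count, toy quadratic losses, and hyperparameters are all made up, and the inner/outer optimizers (AdamW and Nesterov momentum in the DiLoCo line of work) are simplified to plain SGD.

```python
import numpy as np

# Illustrative sketch of a decoupled inner/outer training loop.
# Assumptions: each worker's data shard is modeled as a quadratic
# loss 0.5 * ||p - target_i||^2, and both optimizers are plain SGD
# (the actual method uses AdamW inner steps and a Nesterov outer step).

rng = np.random.default_rng(0)
DIM, NUM_WORKERS = 4, 3
worker_targets = rng.normal(size=(NUM_WORKERS, DIM))  # one shard per worker

def local_phase(params, target, local_steps=20, inner_lr=0.1):
    """Inner loop: many local steps with no cross-worker communication."""
    p = params.copy()
    for _ in range(local_steps):
        p -= inner_lr * (p - target)  # gradient of this worker's quadratic loss
    return p

def train(outer_rounds=10, outer_lr=0.7):
    global_params = np.zeros(DIM)
    for _ in range(outer_rounds):
        # Workers start from the shared params and train independently.
        local_results = [local_phase(global_params, t) for t in worker_targets]
        # Pseudo-gradient: the average displacement the local phases produced.
        pseudo_grad = global_params - np.mean(local_results, axis=0)
        # Outer update: the only step that moves bytes between sites.
        global_params = global_params - outer_lr * pseudo_grad
    return global_params

final = train()
# The global params approach the minimizer of the averaged worker losses.
print(np.abs(final - worker_targets.mean(axis=0)).max())
```

Because communication happens only once per outer round rather than once per gradient step, the bytes exchanged shrink by roughly the number of local steps per round, which is why slow or flaky links stop being the bottleneck.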