Decoupled DiLoCo enables distributed AI training over networks with high latency and unstable connections. Researchers at Google DeepMind decoupled the model's outer weight updates from the communication round, so local computation no longer stalls waiting for synchronization. This lets training proceed across disparate clusters without sacrificing convergence, and practitioners can now leverage geographically distributed hardware for large-scale model optimization.
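The decoupling idea can be illustrated with a toy simulation. In DiLoCo-style training, each worker runs several local optimizer steps, and the displacement of its weights (the "pseudo-gradient") is averaged across workers to update the shared model. In the decoupled variant, the averaged pseudo-gradient from the previous round is applied while the current round's all-reduce is still in flight, overlapping communication with computation. The sketch below is a minimal, single-process simulation of that one-round delay; the quadratic loss, step sizes, and round counts are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def inner_steps(params, target, lr=0.1, steps=20):
    """Local SGD on an illustrative quadratic loss ||params - target||^2."""
    p = params.copy()
    for _ in range(steps):
        grad = 2.0 * (p - target)  # gradient of the local loss
        p -= lr * grad
    return p

def decoupled_diloco(worker_targets, rounds=40, outer_lr=0.5):
    """Simulate outer updates that lag communication by one round."""
    global_params = np.zeros_like(worker_targets[0])
    pending = np.zeros_like(global_params)  # pseudo-gradient still "in flight"
    for _ in range(rounds):
        # Inner steps run from the current global params WITHOUT waiting
        # for the previous round's all-reduce to finish.
        local_params = [inner_steps(global_params, t) for t in worker_targets]
        # Pseudo-gradient = displacement of local weights; launching its
        # all-reduce here overlaps with the next round's computation.
        new_pg = np.mean([global_params - p for p in local_params], axis=0)
        # The PREVIOUS round's pseudo-gradient has now arrived; apply it.
        global_params -= outer_lr * pending
        pending = new_pg
    return global_params

# Two simulated workers pulling toward different targets: the shared
# model should settle near the mean of the targets despite the delay.
targets = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
params = decoupled_diloco(targets)
print(params)
```

Because the delayed pseudo-gradient adds a one-round lag to the outer update, the iterates oscillate slightly before settling; in this toy setting they still converge to the consensus point, which is the behavior the decoupling relies on.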