Google DeepMind introduced Decoupled DiLoCo to enable large-scale model training across unstable networks. This method allows workers to train independently and synchronize weights asynchronously, preventing a single slow node from stalling the entire cluster. It reduces communication overhead significantly. Practitioners can now train massive models across geographically dispersed data centers without requiring high-bandwidth interconnects.