Training on power-law distributed data outperforms training on uniform data across multi-step arithmetic and state-tracking tasks. The authors of an arXiv preprint prove that training on such skewed data requires significantly less data to master compositional skills. The finding challenges the common intuition that curators should rebalance long-tail data toward uniform coverage. Practitioners may want to reconsider data reweighting strategies for complex reasoning.
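
To make the contrast concrete, here is a minimal sketch of how a power-law and a uniform sampling distribution over skills might be built and compared. It is illustrative only and is not the preprint's setup: the function names, the exponent `alpha`, and the idea of ranking skills 1..N are all assumptions introduced here.

```python
import random
from collections import Counter


def power_law_weights(num_skills, alpha=1.0):
    """Normalized weights proportional to 1 / rank**alpha (hypothetical choice of form)."""
    raw = [1.0 / (rank ** alpha) for rank in range(1, num_skills + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def uniform_weights(num_skills):
    """Every skill gets the same sampling probability."""
    return [1.0 / num_skills] * num_skills


def sample_skill_counts(weights, num_examples, seed=0):
    """Draw skill indices according to `weights` and count examples per skill."""
    rng = random.Random(seed)
    draws = rng.choices(range(len(weights)), weights=weights, k=num_examples)
    counts = Counter(draws)
    return [counts.get(i, 0) for i in range(len(weights))]


if __name__ == "__main__":
    num_skills = 10  # assumed number of distinct skills, for illustration
    skewed = power_law_weights(num_skills, alpha=1.0)
    flat = uniform_weights(num_skills)
    print("power-law counts:", sample_skill_counts(skewed, 10_000))
    print("uniform counts:  ", sample_skill_counts(flat, 10_000))
```

Under the power-law weights the first few skills receive most of the examples and the tail receives few, while the uniform weights spread examples evenly. Whether a given `alpha` helps on a particular task is an empirical question this sketch does not answer.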