Training on power-law distributions outperforms training on uniform data in multi-step arithmetic and state-tracking tasks. In a paper posted on arXiv, researchers prove that training on such skewed data requires significantly less data to master long-tail skills than training on uniform data does. The findings challenge the common intuition that data curation should target uniformity. Practitioners should reconsider data-balancing strategies when the goal is to improve compositional reasoning.
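To make the contrast concrete, here is a minimal sketch of the two sampling schemes over a hypothetical set of skills. The exponent `alpha`, the number of skills, and all function names are illustrative assumptions; the summary above does not specify the setup used in the paper.

```python
import random
from collections import Counter


def power_law_weights(num_skills: int, alpha: float = 1.0) -> list[float]:
    """Zipf-like mix: the skill at rank r gets probability proportional to r**(-alpha).

    `alpha` is a hypothetical knob, not a value taken from the paper.
    """
    raw = [rank ** (-alpha) for rank in range(1, num_skills + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def uniform_weights(num_skills: int) -> list[float]:
    """Balanced mix: every skill is equally likely."""
    return [1.0 / num_skills] * num_skills


def sample_skill_counts(weights: list[float], num_examples: int, seed: int = 0) -> Counter:
    """Draw `num_examples` skill indices according to `weights` and count them."""
    rng = random.Random(seed)
    draws = rng.choices(range(len(weights)), weights=weights, k=num_examples)
    return Counter(draws)


if __name__ == "__main__":
    num_skills, num_examples = 10, 10_000
    skewed = sample_skill_counts(power_law_weights(num_skills), num_examples)
    balanced = sample_skill_counts(uniform_weights(num_skills), num_examples)
    for skill in range(num_skills):
        print(f"skill {skill}: power-law={skewed[skill]:5d}  uniform={balanced[skill]:5d}")
```

Under the power-law mix, the head skills dominate the sample while the tail skills appear only occasionally; the uniform mix spreads examples evenly. The sketch only illustrates the two data distributions being compared and makes no claim about model performance.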