Training on power-law data distributions outperforms uniform sampling across multi-step arithmetic and state-tracking tasks. In a recent arXiv preprint, researchers report that models trained on such skewed distributions need significantly less data to master complex skill compositions. The findings challenge the common intuition that dataset curators should rebalance long-tail data, suggesting that LLM developers should prioritize natural data distributions over artificial balancing. A minimal sketch of the two sampling regimes follows below.
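To make the contrast concrete, here is an illustrative sketch (not code from the paper) of how training examples spread across skills under a power-law versus a uniform sampler. The skill count, the exponent `alpha`, and the sample size are all assumed values chosen for demonstration.

```python
# Illustrative sketch (assumed parameters, not the paper's setup):
# compare how often each "skill" appears in a synthetic training mix
# drawn under a power-law vs. a uniform distribution.
import numpy as np

rng = np.random.default_rng(0)

n_skills = 20          # hypothetical number of atomic skills
alpha = 1.5            # hypothetical power-law exponent
n_examples = 100_000   # size of the synthetic training set

# Power-law weights: p(k) proportional to k^(-alpha) for ranks k = 1..n_skills
ranks = np.arange(1, n_skills + 1)
power_law = ranks.astype(float) ** -alpha
power_law /= power_law.sum()

uniform = np.full(n_skills, 1.0 / n_skills)

# Draw a training mix under each distribution and count per-skill frequency
power_law_counts = np.bincount(
    rng.choice(n_skills, size=n_examples, p=power_law), minlength=n_skills
)
uniform_counts = np.bincount(
    rng.choice(n_skills, size=n_examples, p=uniform), minlength=n_skills
)

for k in range(5):
    print(f"skill {k + 1}: power-law {power_law_counts[k]:>6}, "
          f"uniform {uniform_counts[k]:>6}")
```

Under the power-law mix, a few head skills dominate while tail skills still appear occasionally; under the uniform mix, every skill is seen equally often. The reported finding is that the first regime, despite its imbalance, is the more sample-efficient one for learning compositions.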