Training on power-law distributed data outperforms training on uniform data in tasks like multi-step arithmetic and state tracking. Researchers, in work posted to arXiv, show that models trained on such asymmetric data need significantly less training data to master complex skills. This contradicts the common intuition that curated, balanced datasets are superior. Practitioners should reconsider aggressive data reweighting for reasoning tasks.
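
To make the contrast concrete, the following is a minimal sketch of how a power-law mix of skills differs from a uniform one. It is an illustration only, not the researchers' setup: the function names, the number of skills, and the exponent `alpha` are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def power_law_weights(num_skills, alpha=1.0):
    """Sampling weights proportional to 1 / rank**alpha, normalized to sum to 1.

    `alpha` is an assumed, illustrative exponent; alpha = 0 gives a uniform mix.
    """
    raw = [1.0 / (rank ** alpha) for rank in range(1, num_skills + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def uniform_weights(num_skills):
    """Equal sampling weight for every skill (the balanced baseline)."""
    return [1.0 / num_skills] * num_skills


def sample_skill_counts(weights, num_examples, seed=0):
    """Draw `num_examples` skill indices and count how often each skill appears."""
    rng = random.Random(seed)
    draws = rng.choices(range(len(weights)), weights=weights, k=num_examples)
    counts = Counter(draws)
    return [counts.get(i, 0) for i in range(len(weights))]


if __name__ == "__main__":
    num_skills = 5  # hypothetical number of distinct skills in the dataset
    skewed = power_law_weights(num_skills, alpha=1.0)
    flat = uniform_weights(num_skills)
    print("power-law counts:", sample_skill_counts(skewed, 1000))
    print("uniform counts:  ", sample_skill_counts(flat, 1000))
```

Under the power-law mix, the first few skills dominate the sample while the rest form a long tail. Reweighting toward the uniform mix removes exactly that skew, which is the step the paragraph above suggests reconsidering for reasoning tasks.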