Training on power-law-distributed data outperforms training on uniformly distributed data in compositional reasoning tasks such as multi-step arithmetic. The researchers found that this skew lets models learn complex, multi-step skills from significantly less data. The study challenges the common intuition that curators should reweight long-tail knowledge toward a more uniform distribution. If these findings (arXiv:2604.22951) hold up, they may change how developers curate training sets.
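As a rough illustration only (not code from the paper), the sketch below contrasts sampling training examples' skills under a Zipf-like power law with uniform sampling; the skill names, exponent, and sample count are hypothetical.

```python
import random
from collections import Counter

# Hypothetical pool of compositional "skills" (e.g., arithmetic sub-steps);
# the names and the exponent below are illustrative, not from the paper.
skills = [f"skill_{i}" for i in range(1, 21)]
alpha = 1.5  # assumed power-law exponent

# Power-law (Zipf-like) weights: the skill at rank r gets weight 1 / r**alpha.
power_weights = [1.0 / (rank ** alpha) for rank in range(1, len(skills) + 1)]

def sample(weights, n=10_000, seed=0):
    """Draw n training examples' skills under the given sampling weights."""
    rng = random.Random(seed)
    return Counter(rng.choices(skills, weights=weights, k=n))

power_counts = sample(power_weights)           # heavily skewed toward head skills
uniform_counts = sample([1.0] * len(skills))   # roughly equal coverage of every skill

print("power-law head vs. tail:", power_counts["skill_1"], power_counts["skill_20"])
print("uniform   head vs. tail:", uniform_counts["skill_1"], uniform_counts["skill_20"])
```

The contrast is the point: the power-law sampler concentrates examples on a few head skills while still occasionally covering the tail, whereas the uniform sampler spreads the same budget evenly across all skills.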