Training data that exceeds a model's capacity can degrade factual recall rather than improve it. Researchers at Apple found that pruning training sets improves a model's ability to memorize specific facts: when the information content of the data outstrips what the model can store, individual facts effectively compete for limited capacity. This information-theoretic framing suggests hallucinations can be reduced by optimizing the data distribution, so practitioners can improve knowledge retrieval by removing redundant or excessive examples rather than simply increasing dataset size.
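As a toy illustration of the pruning idea, the sketch below removes exact duplicates after light normalization. This is my own minimal example, not the researchers' method; production pipelines would typically use near-duplicate detection (e.g., MinHash) and quality scoring on top of this.

```python
import hashlib

def prune_redundant(examples):
    """Drop examples that are duplicates after whitespace/case normalization.

    A hypothetical, minimal redundancy filter: each text is normalized,
    hashed, and kept only the first time its hash is seen.
    """
    seen = set()
    kept = []
    for text in examples:
        normalized = " ".join(text.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append(text)
    return kept

data = [
    "Paris is the capital of France.",
    "paris is the capital of   france.",  # duplicate up to case/spacing
    "Mount Everest is 8,849 m tall.",
]
print(prune_redundant(data))  # keeps 2 of the 3 examples
```

Even this crude filter shows the shift in mindset: instead of asking how to add more data, ask which examples add no new information relative to what the model can store.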