Training data whose factual content exceeds a model's storage capacity degrades factual accuracy: a model cannot reliably memorize more facts than its parameters can encode. Researchers at Apple found that pruning training sets with an information-theoretic criterion improves a model's ability to memorize facts, which suggests that a smaller, higher-quality corpus may reduce hallucinations. Practitioners can therefore improve knowledge retention by matching the volume of factual training data to the model's parameter budget.
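The general idea of budget-constrained pruning can be illustrated with a minimal sketch. This is not the study's actual method; the scoring inputs (`bits_per_example`) and the capacity budget are hypothetical placeholders for whatever information estimate and parameter budget a practitioner would derive in practice.

```python
# Illustrative sketch: keep the highest-information training examples
# whose total estimated information fits a model's capacity budget.
# The per-example bit estimates and the budget are assumed inputs,
# not part of the method described in the study above.

def prune_to_capacity(examples, bits_per_example, capacity_bits):
    """Greedily keep high-information examples within a bit budget."""
    ranked = sorted(zip(examples, bits_per_example),
                    key=lambda pair: pair[1], reverse=True)
    kept, used = [], 0.0
    for example, bits in ranked:
        if used + bits <= capacity_bits:
            kept.append(example)
            used += bits
    return kept

# Toy usage: four examples, a budget of 10 "bits".
examples = ["fact A", "fact B", "fact C", "fact D"]
bits = [6.0, 4.0, 3.0, 2.0]
print(prune_to_capacity(examples, bits, 10.0))  # ['fact A', 'fact B']
```

The greedy selection here is only one possible policy; the point is that pruning is driven by an explicit information budget rather than by dataset size alone.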