Information-theoretic analysis reveals that an LLM's factual accuracy degrades once the knowledge contained in its training data exceeds what the model's parameters can store. Researchers at Apple found that pruning redundant training data keeps the model below this saturation point, and that the resulting, more even spread of factual knowledge across the model's capacity reduces hallucinations. Practitioners can therefore improve performance on knowledge-intensive tasks by removing low-utility training samples before training.
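As a minimal sketch of what redundancy pruning might look like in practice: the snippet below greedily drops training samples that are near-duplicates of ones already kept. The use of character n-gram Jaccard overlap as the redundancy signal, and the 0.6 threshold, are illustrative assumptions for this sketch, not the researchers' actual method, which presumably uses a more principled utility score.

```python
def ngrams(text: str, n: int = 3) -> set[str]:
    """Character n-grams as a crude fingerprint of a training sample."""
    text = text.lower()
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap between two fingerprint sets (0.0 = disjoint, 1.0 = identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def prune_redundant(samples: list[str], threshold: float = 0.6) -> list[str]:
    """Keep each sample only if it is not a near-duplicate of any kept sample.

    Greedy O(n^2) pairwise comparison -- fine for a sketch, far too slow
    for a web-scale corpus, where MinHash/LSH-style sketching would be used.
    """
    kept: list[str] = []
    kept_grams: list[set[str]] = []
    for s in samples:
        g = ngrams(s)
        if all(jaccard(g, kg) < threshold for kg in kept_grams):
            kept.append(s)
            kept_grams.append(g)
    return kept


if __name__ == "__main__":
    corpus = [
        "The Eiffel Tower is in Paris.",
        "The Eiffel Tower is located in Paris.",  # near-duplicate: pruned
        "Mount Fuji is the tallest peak in Japan.",
    ]
    for sample in prune_redundant(corpus):
        print(sample)
```

The design intuition matches the summary above: each pruned duplicate frees capacity that would otherwise be spent re-encoding the same fact, leaving more room for distinct factual knowledge.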