Information-theoretic analysis by Apple researchers shows that overstuffing training data can actually degrade factual accuracy. When the volume of facts to be stored exceeds a model's capacity, the model struggles to memorize any of them reliably, and hallucinations increase. Pruning redundant or excessive information lets a model spend its fixed capacity on the facts that matter. The finding suggests a shift toward curated, leaner datasets for knowledge-intensive tasks.
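The capacity argument can be made concrete with a back-of-envelope sketch. The constants below (bits storable per parameter, bits per atomic fact) are illustrative assumptions for this example, not figures from the research; the point is only the comparison between the dataset's factual content and the model's storage budget.

```python
# Toy capacity check: does a dataset's factual content exceed what a
# model can plausibly memorize? All constants are illustrative assumptions.
BITS_PER_PARAM = 2.0   # assumed storable bits per model parameter
BITS_PER_FACT = 64.0   # assumed bits needed to encode one atomic fact

def capacity_bits(n_params: float) -> float:
    """Rough upper bound on how many bits a model can memorize."""
    return n_params * BITS_PER_PARAM

def overstuffed(n_params: float, n_facts: float) -> bool:
    """True when the dataset's factual content exceeds model capacity."""
    return n_facts * BITS_PER_FACT > capacity_bits(n_params)

# A 1B-parameter model against datasets of different sizes:
print(overstuffed(1e9, 10e6))   # 10M facts fit within capacity  -> False
print(overstuffed(1e9, 100e6))  # 100M facts exceed capacity     -> True
```

Under these toy numbers, the 1B-parameter model has a ~2-gigabit budget: 10M facts fit with room to spare, while 100M facts oversubscribe it, which is the regime where per-fact recall degrades.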