CNN, NBC, and USA Today are fighting to block their content from a web archive used for chatbot training. The publishers aim to curb unauthorized data harvesting by AI companies, and the move highlights a growing legal conflict over copyright. It also narrows the supply of high-quality training data available to model developers.