The RealChart2Code benchmark tested 14 leading AI models on complex visualizations built from real-world datasets. Even the top proprietary models lost nearly half their performance as chart complexity increased. This gap suggests a critical failure in spatial reasoning. Practitioners should therefore verify AI-generated data visualizations manually before relying on them, to avoid costly interpretation errors in production environments.