Fifty-three tracked predictions show that AI capability benchmarks are slipping, with SWE-bench scores trailing behind forecasts. While compute scaleups lag, safety and governance risks are accelerating. Anthropic reported that Claude Mythos Preview discovered thousands of autonomous zero-day vulnerabilities. This divergence suggests that security threats are outpacing the actual utility of current models.