Capability evaluations often accelerate the very research they aim to monitor. A new proposal posted on the AI Alignment Forum argues for shifting the focus of evaluation toward model behaviors in order to better assess safety risks. The approach aims to decouple performance metrics from development cycles, so that measuring a model does not double as a target for improving it. Practitioners must now weigh whether benchmarking capabilities inadvertently creates a roadmap for riskier model iterations.