Capability evaluations often accelerate the very research they are meant to monitor. An argument advanced on the AI Alignment Forum holds that evaluating model behaviors is a safer alternative to measuring raw performance. Shifting the focus in this way helps evaluators avoid inadvertently providing a roadmap for capability gains. Practitioners should therefore prioritize behavioral monitoring to better forecast and mitigate emergent risks.