Capability evaluations often accelerate the very research they intend to monitor. The AI Alignment Forum argues that focusing on how models behave is more critical for safety than measuring raw skill. This shift helps researchers forecast risks without inadvertently speeding up development. Practitioners should prioritize behavioral monitoring to avoid dangerous externalities in model training.