Capability evaluations often accelerate the very research they aim to monitor. Contributors to the AI Alignment Forum argue that focusing solely on performance benchmarks creates dangerous externalities, because results that show what a model can do also show where further capability gains are within reach. Shifting the focus toward behavioral evaluations lets researchers detect risks without inadvertently providing a roadmap for those gains, and it helps safety practitioners identify hazardous tendencies before they scale.