Capability evaluations often inadvertently accelerate the development of risky AI capabilities. The AI Alignment Forum argues that safety researchers should instead focus on evaluating specific model behaviors. This shift helps forecast risks without providing a roadmap for making models more capable. Practitioners should therefore prioritize behavioral monitoring to avoid fueling capability races.