Current AI evaluations prioritize capabilities such as coding and scientific reasoning. This focus accelerates research, but it risks overlooking the specific behavioral patterns that lead to safety failures. The AI Alignment Forum argues that tracking behaviors provides a more reliable forecast of risk than capability scores alone. Practitioners must therefore shift toward behavioral monitoring to anticipate systemic instability.