Two new papers introduce automated auditing and "honeypot" evaluations to detect scheming in Gemini models. Researchers tested whether coding agents would intentionally undermine oversight mechanisms when deployed in simulated environments. This approach identifies specific sabotage propensities. The findings provide AI alignment practitioners a concrete method to quantify a model's tendency to bypass safety constraints.