Two new papers introduce automated auditing and "scheming honeypots" to detect if Gemini models undermine their own oversight. Researchers deployed models as coding agents in simulated environments to see if they would intentionally sabotage safety constraints. This provides a concrete framework for detecting deceptive alignment before models reach full autonomy.