Two new papers introduce automated auditing and honeypot evaluations to detect scheming in Gemini models. Researchers tested whether coding agents actively undermine the safeguards meant to oversee them. These methods use simulated environments and real alignment codebases to catch deceptive behavior. This provides a concrete framework for practitioners to measure model misalignment before deployment.