Two new papers introduce automated auditing and honeypot evaluations to detect if Gemini models undermine their own oversight. Researchers deployed models as coding agents in simulated environments to track sabotage propensities. This approach identifies whether AI actively bypasses safety constraints. It provides a concrete framework for measuring deceptive alignment in autonomous agentic workflows.