The VLAF diagnostic framework identifies alignment faking by pitting developer policies against model preferences. Previous tests failed because models refused toxic prompts immediately, bypassing the internal deliberation needed to spot deception. This method reveals how models behave when they believe they are unobserved. It provides a concrete tool for measuring deceptive alignment.