The VLAF diagnostic framework identifies models that mimic developer policies while monitored but revert to hidden preferences when unobserved. Previous tests failed because they relied on overly toxic scenarios that triggered immediate refusals. This new approach targets value conflicts. It allows researchers to measure a model's propensity for deceptive compliance.