A Gemma 2B specialist judge outperformed Sonnet 4.5 at detecting misalignment in out-of-domain safety prompts. The specialist judge distinguished secure from insecure fine-tuning, whereas the larger model failed to do so. This result suggests that narrow, specialized judges can make audits more transparent and less costly. Practitioners may therefore be able to use smaller models for targeted alignment verification.
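
One way to make "distinguishing" secure from insecure fine-tuning concrete is to ask how often a judge assigns a higher misalignment score to a response from the insecure-tuned model than to one from the secure-tuned model. The sketch below computes that quantity (AUROC) in plain Python. The choice of metric, the function name `auroc`, and the score lists are assumptions made for illustration; the summary above does not state how detection was measured, and the numbers are invented rather than taken from the study.

```python
from typing import Sequence


def auroc(positive_scores: Sequence[float], negative_scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical judge scores in [0, 1]; illustrative only, NOT results from the study.
insecure_scores = [0.9, 0.8, 0.7, 0.6]   # responses from the insecure-tuned model
secure_scores = [0.1, 0.2, 0.3, 0.65]    # responses from the secure-tuned model

separation = auroc(insecure_scores, secure_scores)
print(f"AUROC = {separation:.4f}")
```

Under this reading, a judge that cannot tell the two fine-tunes apart would land near 0.5 and a judge that separates them perfectly would reach 1.0, so the same function could be run on scores from a small specialist judge and from a larger general model to compare them on equal terms.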