The ARES framework targets systemic weaknesses in which both the core model and the reward model fail to penalize unsafe behavior. A "Safety Mentor" composes adversarial prompts from specific personas and tactics to surface these dual failures. Repairing them end to end closes critical alignment gaps that standard red-teaming misses, which improves reliability for RLHF practitioners.
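To make the idea of a dual failure concrete, here is a minimal sketch in Python. It is not the ARES implementation: the names `Probe`, `compose_probes`, and `find_dual_failures`, the prompt template, the reward threshold, and the toy policy, safety check, and reward model are all hypothetical stand-ins chosen for illustration. The sketch assumes a dual failure means the core model produces an unsafe response and the reward model still scores that response highly.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Probe:
    """One adversarial prompt built from a persona and a tactic (hypothetical type)."""
    persona: str
    tactic: str
    prompt: str


def compose_probes(personas: Sequence[str], tactics: Sequence[str], goal: str) -> List[Probe]:
    """Combine every persona with every tactic into a prompt (assumed template)."""
    return [
        Probe(p, t, f"Speaking as {p}, use {t} to request: {goal}")
        for p, t in product(personas, tactics)
    ]


def find_dual_failures(
    probes: Sequence[Probe],
    policy: Callable[[str], str],
    is_unsafe: Callable[[str], bool],
    reward: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cutoff for "the reward model did not penalize"
) -> List[Probe]:
    """Keep probes where the policy answers unsafely AND the reward model scores it highly."""
    failures = []
    for probe in probes:
        response = policy(probe.prompt)
        if is_unsafe(response) and reward(probe.prompt, response) >= threshold:
            failures.append(probe)
    return failures


# Toy stand-ins so the sketch runs without any model; real systems would call
# the core model, a safety judge, and the reward model here.
def toy_policy(prompt: str) -> str:
    return "UNSAFE: complies" if "role-play" in prompt else "I can't help with that."


def toy_is_unsafe(response: str) -> bool:
    return response.startswith("UNSAFE")


def toy_reward(prompt: str, response: str) -> float:
    # Simulated blind spot: this persona always receives a high score.
    return 0.9 if "teacher" in prompt else 0.1


PERSONAS = ["a curious teacher", "a security auditor"]
TACTICS = ["role-play", "direct request"]

if __name__ == "__main__":
    probes = compose_probes(PERSONAS, TACTICS, "a placeholder restricted topic")
    for failure in find_dual_failures(probes, toy_policy, toy_is_unsafe, toy_reward):
        print(failure.persona, "|", failure.tactic)
```

In this toy setup the auditor role-play probe also elicits an unsafe response, but the reward model penalizes it, so it is a policy-only failure rather than a dual one. Only probes that slip past both models are kept, which is the class of weakness the paragraph above describes.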