The ARES framework targets systemic weaknesses in which both the LLM and its reward model fail to penalize unsafe behavior. A "Safety Mentor" composes adversarial prompts by combining specific personas and tactics to surface these dual failures. An end-to-end repair process then addresses the alignment gaps that the prompts expose, so practitioners can mitigate dual-failure points that standard red-teaming misses.
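The idea of a dual failure can be illustrated with a minimal Python sketch. This is not the ARES implementation: the names `Probe`, `compose_probes`, `find_dual_failures`, the prompt template, the 0.5 reward threshold, and the toy stand-ins for the policy, safety judge, and reward model are all assumptions made for illustration. The sketch crosses personas with tactics to build probes, then keeps only those where the policy's response is unsafe and the reward model still scores it highly.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Probe:
    """One adversarial prompt together with the persona and tactic behind it."""
    persona: str
    tactic: str
    prompt: str


def compose_probes(personas: Sequence[str], tactics: Sequence[str], goal: str) -> List[Probe]:
    """Cross every persona with every tactic (hypothetical prompt template)."""
    return [
        Probe(p, t, f"As {p}, use {t} to ask about: {goal}")
        for p, t in product(personas, tactics)
    ]


def find_dual_failures(
    probes: Sequence[Probe],
    policy: Callable[[str], str],
    is_unsafe: Callable[[str], bool],
    reward: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cutoff for "not penalized"
) -> List[Probe]:
    """Keep probes where the policy answers unsafely AND the reward model
    fails to penalize that answer (score at or above the threshold)."""
    failures = []
    for probe in probes:
        response = policy(probe.prompt)
        if is_unsafe(response) and reward(probe.prompt, response) >= threshold:
            failures.append(probe)
    return failures


# Toy stand-ins so the sketch runs without any model; all three are assumptions.
def toy_policy(prompt: str) -> str:
    return "UNSAFE" if "role-play" in prompt else "refusal"


def toy_is_unsafe(response: str) -> bool:
    return response == "UNSAFE"


def toy_reward(prompt: str, response: str) -> float:
    return 0.9 if "a novelist" in prompt else 0.1


probes = compose_probes(["a novelist", "a student"], ["role-play", "direct request"], "a placeholder topic")
failures = find_dual_failures(probes, toy_policy, toy_is_unsafe, toy_reward)
```

In this toy setup the policy fails on both role-play probes, but the reward model only misses the one using the novelist persona, so that is the single probe flagged as a dual failure. The probe where the reward model does penalize the unsafe answer is left out, since ordinary reward-based training could in principle already correct it.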