A specialized reviewer agent evaluates tool-calling trajectories inside the execution loop rather than after the run has finished. This research from Apple shifts assessment from post-hoc analysis to inference-time feedback, so the acting agent can correct parameter errors and tool-selection mistakes as they occur. Practitioners can therefore implement dynamic self-correction without relying on slow prompt tuning or expensive retraining cycles.
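To make the idea of inference-time review concrete, here is a minimal sketch of a review step placed between proposing a tool call and executing it. It is not the method from the research: the reviewer here is a simple rule-based stand-in for what would be a model, and every name in it (`ToolCall`, `Review`, `TOOLS`, `review`, `run_step`, `scripted_agent`) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToolCall:
    tool: str
    args: dict[str, Any]


@dataclass
class Review:
    ok: bool
    feedback: str = ""


# Hypothetical tool registry: name -> (callable, required parameter names).
TOOLS = {
    "get_weather": (lambda city: f"Sunny in {city}", {"city"}),
    "add": (lambda a, b: a + b, {"a", "b"}),
}


def review(call: ToolCall) -> Review:
    """Rule-based stand-in for a reviewer agent (assumption: a real one would be a model).

    Checks the two error classes named above: tool selection and parameters.
    """
    if call.tool not in TOOLS:
        return Review(False, f"unknown tool '{call.tool}'; choose one of {sorted(TOOLS)}")
    _, required = TOOLS[call.tool]
    if set(call.args) != required:
        return Review(False, f"'{call.tool}' expects parameters {sorted(required)}")
    return Review(True)


def run_step(propose: Callable[[Optional[str]], ToolCall], max_revisions: int = 2) -> Any:
    """Propose a call, review it before execution, and feed criticism back on failure."""
    feedback: Optional[str] = None
    for _ in range(max_revisions + 1):
        call = propose(feedback)       # the acting agent sees the last feedback, if any
        verdict = review(call)         # review happens inside the loop, before the tool runs
        if verdict.ok:
            fn, _ = TOOLS[call.tool]
            return fn(**call.args)
        feedback = verdict.feedback
    raise RuntimeError(f"no valid tool call after {max_revisions} revisions: {feedback}")


def scripted_agent(feedback: Optional[str]) -> ToolCall:
    """Stand-in for the acting agent: wrong parameter name first, corrected after feedback."""
    if feedback is None:
        return ToolCall("get_weather", {"location": "Paris"})
    return ToolCall("get_weather", {"city": "Paris"})


print(run_step(scripted_agent))  # prints "Sunny in Paris"
```

The point of the sketch is the placement of `review` inside the loop: the faulty call is rejected before it runs, and the correction comes from feedback at inference time rather than from a changed prompt or retrained weights.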