The PORTool algorithm assigns step-level rewards to LLM agents using only outcome-level supervision. It addresses credit-assignment ambiguity by generating a rewarded tree that identifies which specific tool calls led to success. Because each step receives its own reward, rather than every step in a trajectory sharing one final score, agents can learn from failures more efficiently. Practitioners can therefore optimize complex, multi-step reasoning workflows without manual step-by-step labeling.
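To make the idea of a rewarded tree concrete, the following is a minimal sketch of how outcome-level rewards at the leaves of a tree could be turned into step-level signals. It is not PORTool's actual reward formula, which this summary does not specify. The `Step` class, the `backup` and `assign_advantages` functions, the mean-over-leaves value, and the parent-value baseline are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Step:
    """One tool call in a tree of sampled trajectories (hypothetical structure)."""
    name: str
    children: List["Step"] = field(default_factory=list)
    outcome: Optional[float] = None  # set on leaves only: reward for the final result
    value: float = 0.0               # filled in by backup()
    advantage: float = 0.0           # filled in by assign_advantages()


def backup(node: Step) -> Tuple[float, int]:
    """Set node.value to the mean outcome of the leaves below it.

    Returns (sum of leaf outcomes, number of leaves) for the subtree.
    Assumption: a plain mean is used here purely for illustration.
    """
    if not node.children:
        if node.outcome is None:
            raise ValueError(f"leaf {node.name!r} has no outcome reward")
        node.value = float(node.outcome)
        return node.value, 1
    total, count = 0.0, 0
    for child in node.children:
        child_total, child_count = backup(child)
        total += child_total
        count += child_count
    node.value = total / count
    return total, count


def assign_advantages(node: Step) -> None:
    """Score each step relative to its parent's value (assumed baseline)."""
    for child in node.children:
        child.advantage = child.value - node.value
        assign_advantages(child)


# A toy tree: two candidate first steps, one of which branches again.
calc = Step("calculator", outcome=1.0)        # trajectory ended in success
direct = Step("answer_directly", outcome=0.0)  # trajectory ended in failure
search = Step("search", children=[calc, direct])
guess = Step("guess", outcome=0.0)             # trajectory ended in failure
root = Step("question", children=[search, guess])

backup(root)
assign_advantages(root)

for step in (search, guess, calc, direct):
    print(f"{step.name}: value={step.value:.3f} advantage={step.advantage:+.3f}")
```

In this toy tree only the final outcomes are labeled, yet the `search` step ends up with a positive advantage and the `guess` step with a negative one, because the successful leaf sits below `search`. That is the sense in which sibling branches let outcome-level supervision distinguish individual tool calls.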