The RAMP strategy integrates deep reinforcement learning with automated planning to learn numeric action models through direct interaction with the environment. Instead of relying on offline expert traces, it runs a feedback loop between policy training and model refinement. The learned model then lets agents plan ahead in numeric domains without any pre-existing expert data.
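The feedback loop described above can be illustrated with a minimal sketch. It is not the RAMP implementation: the toy domain, the names `TRUE_EFFECTS`, `step`, `refine_model`, `plan`, and `ramp_like_loop`, and all parameter values are hypothetical. A weighted random sampler stands in for the deep reinforcement learning policy, breadth-first search stands in for a numeric planner, and the model learner assumes every action has a constant additive effect on a single numeric variable.

```python
import random
from collections import deque

# Hypothetical toy domain: one numeric variable x. The agent never reads
# this table directly; it only observes the outcomes returned by step().
TRUE_EFFECTS = {"inc": 3, "dec": -1}


def step(x, action):
    """Environment transition: apply the hidden numeric effect."""
    return x + TRUE_EFFECTS[action]


def refine_model(model, transitions):
    """Model refinement: store the observed effect of each executed action.

    Assumption: effects are constant and additive, so one observation of
    an action is enough to learn it.
    """
    for x, action, x_next in transitions:
        model[action] = x_next - x
    return model


def plan(model, start, goal, max_depth=20):
    """Breadth-first search over the learned model (stand-in for a planner)."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        x, path = frontier.popleft()
        if x == goal:
            return path
        if len(path) >= max_depth:
            continue
        for action, delta in model.items():
            nxt = x + delta
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [action]))
    return None  # goal not reachable with the actions learned so far


def ramp_like_loop(start=0, goal=7, iterations=5, horizon=4, seed=0):
    rng = random.Random(seed)
    model = {}                               # learned numeric action model
    prefs = {a: 1.0 for a in TRUE_EFFECTS}   # stand-in for a trained policy
    best_plan = None
    for _ in range(iterations):
        # 1. The policy interacts with the environment and records transitions.
        x, transitions = start, []
        for _ in range(horizon):
            action = rng.choices(list(prefs), weights=list(prefs.values()))[0]
            x_next = step(x, action)
            transitions.append((x, action, x_next))
            x = x_next
        # 2. The action model is refined from the new transitions.
        refine_model(model, transitions)
        # 3. The planner searches for a plan using the learned model.
        candidate = plan(model, start, goal)
        if candidate is None:
            continue
        # 4. The plan is checked in the real environment and, if it reaches
        #    the goal, fed back to the policy as increased action preferences.
        x = start
        for action in candidate:
            x = step(x, action)
        if x == goal:
            best_plan = candidate
            for action in candidate:
                prefs[action] += 1.0
    return best_plan, model, prefs


if __name__ == "__main__":
    found, learned, prefs = ramp_like_loop()
    print("plan:", found)
    print("learned model:", learned)
    print("policy preferences:", prefs)
```

In this sketch no expert trace is ever supplied: the model is built only from transitions the policy generates, and the policy in turn is shaped only by plans that the learned model makes possible.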