Thirteen open-weight models tested across Omni-MATH and Codeforces show that multi-turn improvements often mirror gains from repeated attempts. Researchers used a student-teacher protocol to isolate actual feedback utility from format correction and test-time computation. This suggests that perceived agent learning from natural language is frequently an illusion of resampling. Practitioners should audit refinement loops.