A study of 34,000 real-world skills finds that modular instructions barely improve agent performance under realistic conditions. Although benchmark results suggest the skills help, weaker models actually perform worse when they are active. This gap between benchmark success and real-world behavior exposes a critical flaw in how researchers evaluate agentic capabilities. Practitioners should prioritize core model reasoning over modular skill sets.