Large language models can restructure complex codebases in hours, yet they stumble on simple commonsense questions. This gap points to a real difference in how LLMs handle formal, pattern-rich tasks versus loosely structured everyday reasoning. It also shows that strong performance on mathematical or coding benchmarks does not imply general intelligence. Practitioners should therefore expect persistent failures on tasks that lack clear formal structure.