A new benchmark of 293 engineering problems finds that Claude Opus leads on thermodynamic reasoning with 94.1% accuracy. The study separates simple property lookups from complex cycle analysis, and the results show a performance gap of up to 32.5 percentage points between tiers. This suggests that, for LLMs, recalling memorized property data is not the same as reasoning about the underlying physics.
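
The tier gap is stated in percentage points, which is an absolute difference between two accuracies and not a relative change. The sketch below shows the arithmetic. The tier names and tallies are hypothetical placeholders chosen for illustration; they are not figures from the benchmark, and the study's actual scoring method may differ.

```python
def accuracy_pct(correct: int, total: int) -> float:
    """Accuracy as a percentage of problems answered correctly."""
    return 100.0 * correct / total


def tier_gap_pp(tier_results: dict[str, tuple[int, int]]) -> tuple[float, dict[str, float]]:
    """Return the gap in percentage points between the best and worst tier.

    tier_results maps a tier name to (correct, total).
    """
    accs = {tier: accuracy_pct(c, t) for tier, (c, t) in tier_results.items()}
    return max(accs.values()) - min(accs.values()), accs


# Hypothetical tallies for one model (NOT data from the study).
hypothetical = {
    "property_lookup": (45, 50),  # 90% on simple lookups
    "cycle_analysis": (30, 50),   # 60% on multi-step cycle problems
}

gap, accs = tier_gap_pp(hypothetical)
relative_drop = 100.0 * gap / accs["property_lookup"]

print(f"gap: {gap:.1f} percentage points")
print(f"relative drop: {relative_drop:.1f}%")
```

With these placeholder numbers the gap is 30 percentage points, while the relative drop from the lookup tier is about 33%. The two measures are easy to confuse, which is why the unit matters when reading the 32.5-point figure.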