Three systematic error patterns keep GPT-5.5 and Opus 4.7 below a 1 percent success rate on the ARC-AGI-3 benchmark. The ARC Prize Foundation identified these patterns across 160 game runs. The failures point to a persistent gap in general reasoning, so practitioners should treat current LLM reasoning as fragile when it faces novel, abstract spatial puzzles.