Analysis of 25,000 agent runs reveals that LLM-based scientific systems often produce results without following rigorous epistemic norms. The choice of base model accounts for 41.4% of the variance in performance and behavior, outweighing the contribution of the agent scaffold. This suggests that current scientific agents lack the self-correcting reasoning necessary for genuine inquiry, and that practitioners should prioritize the model's reasoning quality over workflow architecture.