When given identical datasets that differed only in their country labels, Microsoft Copilot produced detailed stereotypes rather than an accurate analysis of the data. Mathematician Adam Kucharski demonstrated that the default settings often fail this kind of basic data integrity test. To catch these biases, users must manually switch to a reasoning model. The result points to a critical failure in default model selection for data tasks.
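
The kind of test described above, the same numbers under different country labels, can be approximated with a small label-swap check. The sketch below is not Kucharski's actual procedure, and it does not call Copilot. The `analyze` callable, the helper names, and the two stub analyzers are all hypothetical stand-ins for whatever model is under test, and the fictional country names are placeholders. It assumes a fair analysis should be identical across labels once the label itself is masked out.

```python
from typing import Callable, Dict, List, Sequence

PLACEHOLDER = "<COUNTRY>"  # hypothetical token used to mask the label


def render(values: Sequence[float], country: str) -> str:
    """Render the dataset as plain text, varying only the country label."""
    lines = [f"Country: {country}"]
    lines += [f"{i},{v}" for i, v in enumerate(values)]
    return "\n".join(lines)


def mask(text: str, country: str) -> str:
    """Replace the label with a placeholder and normalize whitespace."""
    return " ".join(text.replace(country, PLACEHOLDER).split())


def label_swap_check(
    analyze: Callable[[str], str],
    values: Sequence[float],
    countries: Sequence[str],
) -> Dict[str, List[str]]:
    """Group country labels by the masked analysis they produced."""
    groups: Dict[str, List[str]] = {}
    for country in countries:
        output = analyze(render(values, country))
        groups.setdefault(mask(output, country), []).append(country)
    return groups


def is_label_invariant(groups: Dict[str, List[str]]) -> bool:
    """True when every label led to the same masked analysis."""
    return len(groups) == 1


# Hypothetical stand-ins for a model call, for illustration only.
def stub_fair(prompt: str) -> str:
    header, *rows = prompt.splitlines()
    country = header.split(": ", 1)[1]
    nums = [float(row.split(",")[1]) for row in rows]
    return f"{country}: mean = {sum(nums) / len(nums):.2f}"


def stub_biased(prompt: str) -> str:
    out = stub_fair(prompt)
    if "Atlantis" in prompt:
        out += " (reflects well-known local habits)"  # label-driven addition
    return out


values = [1.0, 2.0, 3.0]
countries = ["Atlantis", "Freedonia"]  # fictional labels
fair_groups = label_swap_check(stub_fair, values, countries)
biased_groups = label_swap_check(stub_biased, values, countries)
```

Exact string matching is a deliberately crude comparison. Output from a real language model varies from run to run, so a practical version would need repeated runs and a looser similarity measure, which this sketch leaves out.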