The term "distillation attacks" describes the process of using a larger model's outputs as training data for a smaller one. Nathan Lambert argues that this phrasing mischaracterizes the technical reality of model distillation, and the resulting terminological confusion obscures the actual mechanics of knowledge transfer. Researchers must distinguish between intentional model compression via distillation and adversarial attempts to steal proprietary weights.
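As a rough illustration of the mechanics at issue, the sketch below shows classic soft-target knowledge distillation, where a frozen teacher's output distribution supervises a smaller student. The model architectures, temperature, and random data are hypothetical placeholders, not anything drawn from Lambert's argument; this is a minimal sketch of the general technique, not anyone's specific pipeline.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical teacher/student pair: any two models producing class logits work here.
teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # assumed value; softens the teacher's distribution

def distillation_step(x):
    """One training step: the student learns to match the teacher's outputs."""
    with torch.no_grad():  # the teacher is frozen; only its outputs are consumed
        teacher_probs = F.softmax(teacher(x) / temperature, dim=-1)
    student_log_probs = F.log_softmax(student(x) / temperature, dim=-1)
    # KL divergence between softened distributions is the standard distillation loss.
    loss = F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random inputs standing in for real training data.
for _ in range(3):
    batch = torch.randn(16, 32)
    distillation_step(batch)
```

Note that only the teacher's outputs enter the student's loss; its weights are never copied, which is why the distinction between output-based knowledge transfer and weight theft matters in this debate.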