Amazon engineers are distilling Anthropic models into smaller, cheaper versions for internal use. This move precedes a shift to token-based pricing next year, which threatens to spike operational expenses. The company is also vetting OpenAI as a potential alternative. This shift highlights the urgent need for efficient model compression in enterprise deployments.