Google researchers have implemented frozen Multi-Token Prediction (MTP) to accelerate Gemini Nano on Pixel devices. The technique reduces inference cost by using pre-trained prediction heads to propose several tokens per decoding step, without updating the model's weights. The approach is reported to maintain model accuracy while reducing latency. Developers can therefore deploy more responsive on-device LLMs without sacrificing output quality or increasing power consumption.
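To make the idea concrete, here is a minimal toy sketch of multi-token drafting with verification, written in Python. It is not Google's implementation and says nothing about how Gemini Nano is actually wired: `base_next`, `draft_heads`, `greedy_decode`, and `mtp_decode` are hypothetical names, the "model" is a simple arithmetic rule, and the draft heads are deliberately made wrong now and then. The sketch assumes that the drafted tokens are checked against the frozen base model's own greedy choice, which is one common way to keep the output unchanged while cutting the number of sequential decoding passes.

```python
from typing import List, Tuple

VOCAB = 50  # toy vocabulary size (assumption for illustration)


def base_next(ctx: List[int]) -> int:
    """Toy stand-in for the frozen base model's greedy next token."""
    return (3 * ctx[-1] + 1) % VOCAB


def draft_heads(ctx: List[int], k: int) -> List[int]:
    """Toy stand-in for k prediction heads: guess the next k tokens.

    Guesses are made wrong on purpose now and then, so that the
    verification step below has something to reject.
    """
    out: List[int] = []
    last = ctx[-1]
    for i in range(k):
        guess = (3 * last + 1) % VOCAB
        if (last + i) % 5 == 0:  # inject an occasional wrong guess
            guess = (guess + 1) % VOCAB
        out.append(guess)
        last = guess
    return out


def greedy_decode(prompt: List[int], n_new: int) -> Tuple[List[int], int]:
    """Baseline: one base-model pass per generated token."""
    seq, passes = list(prompt), 0
    while len(seq) - len(prompt) < n_new:
        seq.append(base_next(seq))
        passes += 1
    return seq, passes


def mtp_decode(prompt: List[int], n_new: int, k: int) -> Tuple[List[int], int]:
    """Draft k tokens per step, keep the prefix the base model agrees with."""
    seq, passes = list(prompt), 0
    target = len(prompt) + n_new
    while len(seq) < target:
        draft = draft_heads(seq, k)
        # Counted as one pass: a real system would verify all k drafts
        # in a single batched forward pass of the base model.
        passes += 1
        for tok in draft:
            expected = base_next(seq)
            seq.append(expected)  # always keep the base model's own token
            if tok != expected or len(seq) == target:
                break  # first mismatch (or length limit) ends this step
    return seq, passes


if __name__ == "__main__":
    g_seq, g_passes = greedy_decode([1], 20)
    m_seq, m_passes = mtp_decode([1], 20, k=4)
    print("same output:", g_seq == m_seq)
    print("passes (greedy vs. drafted):", g_passes, m_passes)
```

Because every appended token is the base model's own greedy choice, the drafted decoder produces the same sequence as plain greedy decoding; the drafts only determine how many tokens can be committed per pass. In this toy the latency gain shows up as a lower pass count, and how much of that carries over to real hardware depends on details the sketch does not model.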