Google researchers used frozen Multi-Token Prediction (MTP) to accelerate Gemini Nano on Pixel devices. The method lowers the computational cost of predicting multiple future tokens at inference time, cutting latency while preserving model accuracy. Developers can now deploy more responsive on-device LLMs without sacrificing the quality of the generated text.
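
To make the idea concrete, here is a minimal toy sketch of how multi-token prediction can reduce the number of base-model decoding steps without changing the output. It is not Google's implementation, which the text above does not describe. It assumes a generic draft-and-verify scheme: extra heads propose a few future tokens, and the frozen base model keeps only the ones it would have produced itself. The names `base_next`, `draft_heads`, `greedy_decode`, `mtp_decode`, and the constant `VOCAB` are all hypothetical stand-ins, and the "model" is a simple arithmetic rule rather than a neural network.

```python
from typing import List, Tuple

VOCAB = 50  # toy vocabulary size (hypothetical)


def base_next(context: List[int]) -> int:
    """Toy stand-in for the frozen base model's greedy next token."""
    return (context[-1] * 7 + len(context)) % VOCAB


def draft_heads(context: List[int], k: int) -> List[int]:
    """Toy stand-in for k extra prediction heads that guess the next k tokens.

    The guess is deliberately wrong whenever the context length is a
    multiple of 4, so that verification has something to reject.
    """
    guess: List[int] = []
    ctx = list(context)
    for _ in range(k):
        tok = base_next(ctx)
        if len(ctx) % 4 == 0:
            tok = (tok + 1) % VOCAB  # injected drafting error
        guess.append(tok)
        ctx.append(tok)
    return guess


def greedy_decode(prompt: List[int], n_new: int) -> Tuple[List[int], int]:
    """Baseline: one base-model pass per generated token."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < n_new:
        out.append(base_next(out))
        passes += 1
    return out, passes


def mtp_decode(prompt: List[int], n_new: int, k: int = 3) -> Tuple[List[int], int]:
    """Draft k tokens, then verify them against the base model.

    Each loop iteration counts as one verification pass. A real system
    would score all k draft positions in a single batched forward pass;
    this toy calls base_next per position but counts it as one pass.
    """
    out = list(prompt)
    passes = 0
    target = len(prompt) + n_new
    while len(out) < target:
        draft = draft_heads(out, k)
        passes += 1
        for tok in draft:
            expected = base_next(out)
            out.append(expected)  # always keep the base model's own token
            if expected != tok or len(out) == target:
                break  # first mismatch (or enough tokens) ends this pass
    return out, passes


if __name__ == "__main__":
    ref, ref_passes = greedy_decode([1, 2, 3], 12)
    out, passes = mtp_decode([1, 2, 3], 12, k=3)
    print("same output:", out == ref)
    print("greedy passes:", ref_passes, "| draft-and-verify passes:", passes)
```

Because every emitted token is the one the base model would have chosen, the output matches plain greedy decoding exactly. Any speedup comes only from accepting several correct draft tokens per pass. In this toy, pass count is a rough proxy for latency; real gains depend on how often the drafts are accepted and on the cost of the extra heads.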