A frozen Multi-Token Prediction head now accelerates Gemini Nano on Pixel devices. The head lets the model predict several future tokens in a single forward pass instead of one, which reduces the number of forward passes needed to generate a response. The result is faster on-device inference without sacrificing accuracy, so developers can deploy more responsive local LLM features with lower computational overhead.
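
To make the forward-pass saving concrete, here is a minimal toy sketch in Python. It is not Gemini Nano's implementation: `base_next_token`, `draft_head`, and the verify-then-accept loop are hypothetical stand-ins, and the assumption that extra-head guesses are checked against the base model before being kept is ours, chosen because it is one common way to keep output identical to ordinary decoding. The sketch does not model training or freezing of the head.

```python
from typing import List, Tuple

VOCAB = 50  # toy vocabulary size (assumption)


def base_next_token(context: List[int]) -> int:
    """Toy stand-in for the base model's next-token choice."""
    return (sum(context) * 31 + len(context)) % VOCAB


def draft_head(context: List[int], n_drafts: int) -> List[int]:
    """Hypothetical multi-token head: guesses the next n_drafts tokens.

    A real head would produce these from the model's hidden state in the
    same forward pass. Here we simulate an imperfect head by peeking at
    the toy base function and deliberately corrupting every guess whose
    position is a multiple of 5.
    """
    guesses, ctx = [], list(context)
    for _ in range(n_drafts):
        true_next = base_next_token(ctx)
        guess = true_next if len(ctx) % 5 else (true_next + 1) % VOCAB
        guesses.append(guess)
        ctx.append(true_next)
    return guesses


def decode_one_at_a_time(prompt: List[int], n_new: int) -> Tuple[List[int], int]:
    """Baseline: one forward pass per generated token."""
    tokens, passes = list(prompt), 0
    for _ in range(n_new):
        tokens.append(base_next_token(tokens))
        passes += 1
    return tokens[len(prompt):], passes


def decode_with_mtp(prompt: List[int], n_new: int, k: int) -> Tuple[List[int], int]:
    """Each simulated pass verifies pending drafts, emits one token from
    the base model, and proposes k - 1 new drafts."""
    tokens, passes = list(prompt), 0
    target = len(prompt) + n_new
    drafts: List[int] = []
    while len(tokens) < target:
        passes += 1
        # Keep drafts only while they match what the base model would emit.
        for d in drafts:
            if len(tokens) < target and d == base_next_token(tokens):
                tokens.append(d)
            else:
                break
        # The pass always yields one token from the base model itself.
        if len(tokens) < target:
            tokens.append(base_next_token(tokens))
        drafts = draft_head(tokens, k - 1)
    return tokens[len(prompt):], passes


if __name__ == "__main__":
    prompt = [1, 2, 3]
    base_out, base_passes = decode_one_at_a_time(prompt, 20)
    mtp_out, mtp_passes = decode_with_mtp(prompt, 20, k=4)
    print("same output:", base_out == mtp_out)
    print("passes (baseline vs multi-token):", base_passes, mtp_passes)
```

Because every kept token equals what the toy base model would have produced, the two decoders return the same sequence, and the multi-token variant simply needs fewer passes. How large the saving is in practice depends on how often the extra head's guesses are accepted, which this toy fixes artificially.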