Google researchers integrated frozen Multi-Token Prediction (MTP) into Gemini Nano to accelerate on-device inference on Pixel phones. With this approach the model predicts several future tokens per decoding step instead of one, and the core weights stay frozen, so the base model does not need to be retrained. Fewer sequential decoding steps mean lower latency for tasks that run locally, which lets developers deploy faster, more responsive LLMs directly on mobile hardware.
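To make the latency argument concrete, here is a minimal toy sketch of draft-and-verify decoding, one common way multi-token prediction is used to speed up generation. It is not Google's implementation, and the summary above does not describe the mechanism in detail. Every name here (`base_next_token`, `draft_tokens`, `mtp_decode`) is hypothetical, and the "model" is a deterministic arithmetic rule rather than a neural network. The sketch assumes the frozen base model checks all drafted tokens in a single pass and keeps only the prefix it agrees with, so the output is identical to ordinary one-token-at-a-time decoding.

```python
from typing import List, Sequence, Tuple

VOCAB = 50  # toy vocabulary size (assumption for illustration)


def base_next_token(context: Sequence[int]) -> int:
    """Toy stand-in for the frozen base model: a deterministic next-token rule."""
    return (sum(context) * 31 + len(context)) % VOCAB


def draft_tokens(context: Sequence[int], k: int) -> List[int]:
    """Toy stand-in for multi-token prediction: guess k future tokens at once.

    A real drafter would be a small learned module. Here we reuse the base
    rule and deliberately inject a wrong guess whenever the running context
    length is a multiple of 3, so that rejection is exercised.
    """
    ctx = list(context)
    guesses = []
    for _ in range(k):
        tok = base_next_token(ctx)
        if len(ctx) % 3 == 0:
            tok = (tok + 1) % VOCAB  # simulated drafting error
        guesses.append(tok)
        ctx.append(tok)
    return guesses


def greedy_decode(prompt: Sequence[int], n_new: int) -> Tuple[List[int], int]:
    """Baseline: one base-model pass per generated token."""
    ctx = list(prompt)
    for _ in range(n_new):
        ctx.append(base_next_token(ctx))
    return ctx[len(prompt):], n_new


def mtp_decode(prompt: Sequence[int], n_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Draft k tokens, then verify them against the frozen base model.

    Each loop iteration is counted as one verification pass, on the
    assumption that a real model scores all k drafted positions in a
    single batched forward pass.
    """
    ctx = list(prompt)
    target_len = len(prompt) + n_new
    passes = 0
    while len(ctx) < target_len:
        draft = draft_tokens(ctx, k)
        passes += 1
        for tok in draft:
            expected = base_next_token(ctx)
            ctx.append(expected)  # always emit the base model's own token
            if tok != expected or len(ctx) >= target_len:
                break  # first mismatch ends the accepted prefix
    return ctx[len(prompt):], passes


if __name__ == "__main__":
    slow_tokens, slow_passes = greedy_decode([1, 2, 3], 12)
    fast_tokens, fast_passes = mtp_decode([1, 2, 3], 12, k=4)
    print("same output:", slow_tokens == fast_tokens)
    print("base-model passes:", slow_passes, "vs", fast_passes)
```

Because every emitted token is the one the base model itself would have produced, the drafter only changes how many sequential passes are needed, not what text comes out. The real speedup on a phone depends on how often drafts are accepted and on how cheap the drafting step is, neither of which this toy measures.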