Google researchers used frozen Multi-Token Prediction (MTP) to accelerate Gemini Nano on Pixel devices. The technique lets the model predict several future tokens in a single step, and "frozen" indicates that the existing network's weights stay fixed rather than being retrained. Because each step can yield more than one token, generation needs fewer sequential decoding steps, which reduces inference latency for on-device tasks. This gives developers more room to deploy complex local models while keeping the real-time responsiveness that mobile user interfaces require.
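
To make the latency argument concrete, here is a minimal toy sketch in Python. It is not Google's implementation: the functions `base_next_token`, `draft_tokens`, `decode_baseline`, and `decode_multi_token` are hypothetical stand-ins, the "model" is a fixed arithmetic rule, and the step in which the base model verifies the drafted tokens is an assumption borrowed from speculative-decoding-style setups, since the text above does not say how predictions are checked. The sketch only shows why drafting several tokens per step can cut the number of sequential decoding steps while leaving the base model untouched.

```python
from typing import List, Tuple

VOCAB_SIZE = 10  # toy vocabulary: tokens are the integers 0..9


def base_next_token(context: List[int]) -> int:
    """Hypothetical stand-in for the frozen base model (never modified here)."""
    return (context[-1] * 3 + 1) % VOCAB_SIZE


def draft_tokens(context: List[int], k: int) -> List[int]:
    """Hypothetical stand-in for k extra prediction heads.

    Returns guesses for the next k tokens in one call. A guess is made
    deliberately wrong whenever the true token would be 0, to mimic
    imperfect heads.
    """
    guesses = []
    last = context[-1]
    for _ in range(k):
        nxt = (last * 3 + 1) % VOCAB_SIZE
        if nxt == 0:
            nxt = 5  # injected error
        guesses.append(nxt)
        last = nxt
    return guesses


def decode_baseline(prompt: List[int], n: int) -> Tuple[List[int], int]:
    """Ordinary decoding: one sequential step per generated token."""
    out = list(prompt)
    steps = 0
    for _ in range(n):
        out.append(base_next_token(out))
        steps += 1
    return out[len(prompt):], steps


def decode_multi_token(prompt: List[int], n: int, k: int) -> Tuple[List[int], int]:
    """Draft k tokens per step, then keep only what the base model agrees with."""
    out = list(prompt)
    steps = 0
    while len(out) - len(prompt) < n:
        draft = draft_tokens(out, k)
        # Assumption: a real model would check all k drafted positions in one
        # batched forward pass. The loop below only simulates that check, so
        # it is counted as a single sequential step.
        steps += 1
        ctx = list(out)
        for guess in draft:
            true_token = base_next_token(ctx)
            ctx.append(true_token)  # the base model's token is always kept
            if guess != true_token:
                break  # discard the remaining drafted tokens
        out = ctx
    return out[len(prompt):len(prompt) + n], steps


prompt = [1]
baseline_tokens, baseline_steps = decode_baseline(prompt, 12)
mtp_tokens, mtp_steps = decode_multi_token(prompt, 12, k=4)
print("same output:", baseline_tokens == mtp_tokens)
print("sequential steps:", baseline_steps, "vs", mtp_steps)
```

In this toy, both decoders emit the same tokens because every kept token comes from the base rule; only the number of sequential steps differs. Real speedups depend on how often the drafted tokens are accepted and on the cost of the extra heads, neither of which this sketch models.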