Google researchers implemented frozen Multi-Token Prediction to accelerate Gemini Nano on Pixel devices. This approach allows the model to predict multiple future tokens simultaneously without retraining the core weights. It reduces inference latency for on-device tasks. Developers can now deploy faster, more efficient small language models without sacrificing predictive accuracy.