Google researchers implemented frozen Multi-Token Prediction to accelerate Gemini Nano on Pixel devices. This technique predicts multiple future tokens simultaneously without retraining the base model. It reduces inference latency while maintaining output quality. Developers can now deploy more responsive on-device LLMs without sacrificing the precision of the original model weights.