Four times faster text generation defines DiffusionGemma, a new approach from Google DeepMind. It replaces traditional autoregressive sampling with a diffusion-based process to accelerate inference. This method reduces the computational overhead of token prediction. Researchers can now generate high-quality text sequences with significantly lower latency than standard LLM architectures.