DiffusionGemma, a new model from Google DeepMind, delivers a 4x increase in text generation speed. It replaces traditional autoregressive sampling with a diffusion-based approach, which accelerates inference and reduces latency for high-throughput tasks. Developers can now deploy faster generative workflows without sacrificing the output quality of the original Gemma architecture.
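
To give a rough intuition for why swapping autoregressive sampling for a diffusion-style approach can cut latency, the toy sketch below counts sequential model calls under each scheme. It is not DiffusionGemma's actual algorithm: the function names, the 256-token length, and the `tokens_per_step=4` setting are all hypothetical values chosen for illustration, and no real model is invoked.

```python
def autoregressive_calls(num_tokens: int) -> int:
    """Count sequential calls when each call produces exactly one token."""
    calls = 0
    sequence = []
    for position in range(num_tokens):
        calls += 1                 # stand-in for one forward pass
        sequence.append(position)  # placeholder for the sampled token
    return calls


def parallel_denoising_calls(num_tokens: int, tokens_per_step: int) -> int:
    """Count sequential calls when each refinement step finalizes several
    positions at once (a simplified, assumed view of diffusion-style decoding)."""
    calls = 0
    remaining = num_tokens  # positions not yet finalized
    while remaining > 0:
        calls += 1          # stand-in for one forward pass over all positions
        remaining -= min(tokens_per_step, remaining)
    return calls


if __name__ == "__main__":
    ar = autoregressive_calls(256)
    pd = parallel_denoising_calls(256, tokens_per_step=4)  # hypothetical setting
    print(ar, pd, ar / pd)  # prints: 256 64 4.0
```

The ratio here falls out of the assumed `tokens_per_step` value and only counts sequential steps. Real wall-clock speedups also depend on the cost of each call, the hardware, and how many refinement steps a given model needs to preserve quality.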