Gemma 4, a 12‑billion‑parameter multimodal model, is now available for on-device use. Developed by Google DeepMind and distributed through Hugging Face, it processes text, images, and audio in a single inference engine, reducing dependence on cloud services. The release supports lightweight deployment on edge GPUs, letting developers embed multimodal intelligence in mobile apps and allowing practitioners to prototype and iterate without costly server infrastructure.