The 12B-parameter Gemma 4 model processes text, images, and audio natively on devices with only 16 GB of RAM. In benchmarks, it nearly matches the performance of the larger 26B version. Google DeepMind released the model under an Apache 2.0 license. The modest memory requirement and permissive license together let developers deploy high-performance multimodal capabilities locally, without expensive hardware.
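
To put the 16 GB figure in context, here is a rough back-of-the-envelope sketch of how much memory the weights of a 12B-parameter model would occupy at a few common numeric precisions. The source does not say which precision or quantization scheme is used, so the precisions listed are assumptions for illustration only. The estimate also assumes every parameter is stored at the same width, counts 1 GB as 10^9 bytes, and ignores activations, caches, and runtime overhead.

```python
# Rough weight-memory estimate for a 12B-parameter model.
# Assumptions (not from the source): uniform storage width per parameter,
# weights only (no activations, KV cache, or runtime overhead),
# and 1 GB = 1e9 bytes.

PARAMS = 12_000_000_000  # 12B parameters, as stated in the text
RAM_GB = 16              # device memory, as stated in the text

# Hypothetical storage widths, in bytes per parameter.
BYTES_PER_PARAM = {
    "16-bit": 2.0,
    "8-bit": 1.0,
    "4-bit": 0.5,
}


def weight_gb(params: int, bytes_per_param: float) -> float:
    """Return the approximate weight footprint in gigabytes."""
    return params * bytes_per_param / 1e9


def fits(params: int, bytes_per_param: float, ram_gb: float) -> bool:
    """Return True if the weights alone fit in the given amount of RAM."""
    return weight_gb(params, bytes_per_param) <= ram_gb


if __name__ == "__main__":
    for name, width in BYTES_PER_PARAM.items():
        gb = weight_gb(PARAMS, width)
        verdict = "fits" if fits(PARAMS, width, RAM_GB) else "does not fit"
        print(f"{name}: {gb:.1f} GB of weights, {verdict} in {RAM_GB} GB")
```

Under these assumptions, 16-bit weights alone would need about 24 GB, which exceeds 16 GB, while 8-bit and 4-bit weights would need about 12 GB and 6 GB respectively. This suggests, but does not confirm, that running on a 16 GB device relies on some form of reduced-precision storage.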