Nvidia released Nemotron 3 Nano Omni, an open multimodal model that supports text, image, video, and audio. Its disclosed training set uses data from Qwen, GPT-OSS, Kimi, and DeepSeek OCR. That transparency shows how much modern multimodal training relies on data from existing open models. Developers can now analyze the specific data mix behind these capabilities.