The Nemotron 3 Nano Omni model processes text, images, video, and audio. Nvidia disclosed that its training set includes data from Qwen, GPT-OSS, Kimi, and DeepSeek OCR. The disclosure shows how heavily compact multimodal systems rely on existing open-source and proprietary data sources, and it lets practitioners examine the specific data mix behind these capabilities.