Granular audio tags now let developers direct AI speech with precise expressive control. Google DeepMind integrated these controls into the Gemini 3.1 Flash model to improve vocal nuance. The update moves beyond generic text-to-speech: practitioners can specify the emotional tone and pacing of a passage, which makes synthetic voice interactions sound more natural.