Google's Gemini 3.5 Flash can now operate computer interfaces directly: it reads the screen and issues keystrokes. This agentic capability lets the model navigate software and automate multi-step workflows. Developers can build tools that drive legacy applications that expose no API. The change shifts the model's role from chatbot to active operator.
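
The perceive-then-act pattern described above can be illustrated with a small loop. The sketch below is only an illustration under stated assumptions, not Google's API: `FakeLegacyApp`, `choose_key`, and `run_agent` are hypothetical names, the "screen" is plain text rather than pixels, and `choose_key` is a hard-coded stand-in for the call a real tool would make to the model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeLegacyApp:
    """Hypothetical stand-in for a legacy app with no API: a screen and a keyboard."""
    typed: str = ""
    submitted: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; text keeps this sketch runnable.
        last = self.submitted[-1] if self.submitted else "-"
        return f"Customer ID: [{self.typed}]\nLast submitted: {last}"

    def press(self, key: str) -> None:
        # ENTER submits the field and clears it; any other key is typed into it.
        if key == "ENTER":
            self.submitted.append(self.typed)
            self.typed = ""
        else:
            self.typed += key


def choose_key(screen: str, goal: str) -> Optional[str]:
    """Hypothetical placeholder for the model: pick the next key from the screen alone."""
    field_line, status_line = screen.splitlines()
    typed = field_line[field_line.index("[") + 1 : field_line.index("]")]
    last = status_line.split(": ", 1)[1]
    if last == goal:
        return None          # the screen shows the goal was submitted: stop
    if typed == goal:
        return "ENTER"       # field is complete: submit it
    return goal[len(typed)]  # otherwise type the next missing character


def run_agent(app: FakeLegacyApp, goal: str, max_steps: int = 20) -> int:
    """Observe, decide, act, repeat. Returns the number of keystrokes sent."""
    for step in range(max_steps):
        key = choose_key(app.screenshot(), goal)
        if key is None:
            return step
        app.press(key)
    raise RuntimeError("step budget exhausted before the goal was reached")


app = FakeLegacyApp()
keystrokes = run_agent(app, "4217")
print(keystrokes, app.submitted)
```

The agent never calls into the application's internals; it only looks at what `screenshot()` returns and sends keys through `press()`, which is why the same loop shape can apply to software that has no API. The `max_steps` budget is a design choice assumed here so a confused policy cannot loop forever.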