WebSockets and connection-scoped caching now reduce API overhead in the Codex agent loop. Because a persistent connection avoids re-negotiating a fresh connection and re-sending unchanged context on every call, this architecture minimizes the repetitive data exchange typical of iterative reasoning tasks. By streamlining the Responses API in this way, OpenAI lowers end-to-end latency, so developers can build more responsive autonomous systems that require frequent, rapid-fire interactions between the model and external tools.
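To make the idea concrete, here is a minimal sketch of connection-scoped caching in an agent loop. All names (`ConnectionSession`, `call_tool`) are hypothetical and not part of any OpenAI API; the cache's lifetime is assumed to be bound to one persistent connection, so repeated identical tool calls within a reasoning loop skip redundant round trips.

```python
import hashlib
import json


class ConnectionSession:
    """Hypothetical connection-scoped cache for an agent loop.

    Assumes the session lives as long as one persistent connection
    (e.g. a WebSocket), so cached tool results stay valid for the
    duration of the reasoning loop and are discarded on disconnect.
    """

    def __init__(self):
        self._cache = {}       # key -> cached tool result
        self.round_trips = 0   # counts simulated network exchanges

    def _key(self, tool, args):
        # Deterministic key over the tool name and its arguments.
        payload = json.dumps([tool, args], sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

    def call_tool(self, tool, args, executor):
        key = self._key(tool, args)
        if key in self._cache:
            return self._cache[key]   # cache hit: no round trip
        self.round_trips += 1         # cache miss: one network exchange
        result = executor(tool, args)
        self._cache[key] = result
        return result


# Usage: the same lookup issued twice costs only one round trip.
session = ConnectionSession()
lookup = lambda tool, args: {"tool": tool, "echo": args}
first = session.call_tool("search", {"q": "codex"}, lookup)
second = session.call_tool("search", {"q": "codex"}, lookup)
assert first == second
assert session.round_trips == 1
```

The design choice worth noting is that the cache is scoped to the connection rather than shared globally: when the connection closes, the cache goes with it, which sidesteps cross-session invalidation entirely.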