Google's Gemini 3.1 Ultra ships a 2M-token context window
Google has launched Gemini 3.1 Ultra, a major upgrade to its flagship model that pushes hard on context length and multimodality. The standout feature is a 2-million-token context window that works natively across text, images, audio and video — enough to hold very large documents, long conversations or entire media files in a single session.
The release also adds a sandboxed Code Execution tool, letting the model write, run and test code in the middle of a conversation rather than just suggesting it. That moves Gemini further toward the agentic pattern now common across AI tools, where the model not only answers questions but also carries out multi-step tasks.
Reliability got attention too. Google says Gemini 3.1 Ultra has significantly improved grounding, which is meant to reduce hallucinations on factual queries by tying answers more closely to verifiable sources.
Alongside the Ultra model, Google made its Imagen 3 Nano and Pro image models widely available, including the ability to use a video file as a prompt to generate context-aware images such as thumbnails and infographics.
The pattern across June 2026's releases is consistent: longer context, native handling of every media type, and built-in tools that let models act, not just talk. For teams comparing AI tools, the practical question is shifting from "which model is smartest" to "which one can reliably take a long, messy task and see it through."