The long-awaited Gemini 3 has finally arrived, and it leads the field by a wide margin. Even Elon Musk and OpenAI's Sam Altman have praised it.
ZDONGXI reported on November 19 that Google's most powerful reasoning model, Gemini 3, finally made its debut early this morning. A single model combines native multimodality, reasoning, and agentic capabilities.
The Google DeepMind research team stated that this is the world's most advanced multimodal understanding model and Google's most powerful agentic coding and vibe coding model. It delivers richer visuals and deeper interactivity, and is built entirely on state-of-the-art reasoning.
The model was trained on Google's TPUs, supports a context window of 1 million tokens, and is suited to applications that need agentic behavior, advanced coding, long context, multimodal understanding, or algorithm development.
As soon as it was released, Gemini 3 topped nearly every benchmark. It ranked first on the LMArena leaderboard with a score of 1501 Elo.
Sam Altman, co-founder and CEO of OpenAI, and Elon Musk, founder and CEO of xAI, both sent "congratulatory messages" to Google. Altman tweeted that "Gemini 3 looks great," and Google CEO Sundar Pichai replied with an emoji.
Musk reposted a tweet from Demis Hassabis, the CEO of Google DeepMind, with the comment "Well done."
Starting today, Google is rolling out Gemini 3 on the following platforms:
For everyone: the Gemini app, plus AI Mode in Search for Google AI Pro and Ultra subscribers.
For developers: the Gemini API, Google's new agent development platform Antigravity, and the Gemini CLI.
For enterprises: the Vertex AI platform and Gemini Enterprise.
In addition, Google will open Gemini 3's Deep Think mode to Google AI Ultra subscribers in the coming weeks. The mode is currently undergoing safety evaluation.
Commenting on the release, Pichai said that Gemini 3 can turn any user's idea into reality.
01.
Create interactive games and apps in minutes
And help you learn new knowledge
Let's first see what Gemini 3 Pro can do.
Gemini 3 can write visualization code for plasma flow in a tokamak device and create poems capturing the physical principles of nuclear fusion.
If users want to learn traditional family cooking, Gemini 3 can decipher and translate handwritten recipes in different languages and turn them into a shareable family cookbook.
If users want to learn a new topic, they can give Gemini 3 academic papers, long video lectures, or tutorials. It can then generate code for interactive flashcards, visualizations, or other formats to help them master the material.
Gemini 3 can analyze a video of a user's pickleball match, identify areas for improvement, and generate a training plan to improve overall form.
In AI Mode in Search, Gemini 3 can help users learn complex topics. For example, the generative user interface in AI Mode can explain a difficult concept such as how RNA polymerase works. Notably, this is the first time Google has brought a new model into AI Mode in Search on the day of the model's release.
Gemini 3 can write a retro 3D spaceship game with a rich visual interface and interactivity.
This model can turn users' imaginations into reality by constructing, deconstructing, and recreating detailed 3D voxel art through code.
Gemini 3 can use shaders to create a playable sci-fi world.
It can also generate more practical, feature-rich interactive web pages and apps.
02.
Dominating benchmark tests
Breaking the ceiling of large-model capabilities
Now let's look at the benchmark test results of Gemini 3 Pro.
Google's blog says that Gemini 3 Pro was evaluated on a range of benchmarks covering reasoning, multimodal capabilities, agentic tool use, multilingual performance, and long context. It far outperformed Gemini 2.5 Pro on every major AI benchmark and ranked first on the LMArena leaderboard with a score of 1501 Elo.
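To give a rough sense of what a gap in Elo points means, here is a minimal sketch. It assumes the conventional Elo logistic formula with a scale of 400, which may not match how LMArena actually computes or scales its ratings. The function name and the rival's rating of 1451 are hypothetical, chosen only for illustration. Only the 1501 figure comes from the article.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the conventional Elo formula.

    The result is a win probability with ties counted as half a win.
    The 400-point scale is the conventional default, assumed here.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


GEMINI_3_PRO_RATING = 1501   # score reported in the article
HYPOTHETICAL_RIVAL = 1451    # made-up rating, for illustration only

p = elo_expected_score(GEMINI_3_PRO_RATING, HYPOTHETICAL_RIVAL)
print(f"Expected preference rate against the hypothetical rival: {p:.3f}")
```

Under these assumptions, a 50-point lead corresponds to being preferred in roughly 57% of head-to-head comparisons, so even a first-place rating does not imply winning every matchup.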
The model demonstrates PhD-level reasoning. It achieved top scores on Humanity's Last Exam (37.5% without using any tools) and GPQA Diamond, and set a new high of 23.4% on MathArena Apex.
Beyond text, Gemini 3 Pro scored 81% on MMMU-Pro and 87.6% on Video-MMMU for multimodal reasoning, and also achieved the highest score on SimpleQA Verified at 72.1%.
This means that Gemini 3 Pro can solve complex problems covering a wide range of topics such as science and mathematics with a high degree of reliability.
The upgraded Deep Think mode pushes Gemini 3's reasoning and multimodal understanding further, helping users solve more complex problems. In tests, Gemini 3 Deep Think outperformed Gemini 3 Pro on Humanity's Last Exam (41.0% without using tools) and GPQA Diamond (93.8%). It scored 45.1% on ARC-AGI-2 (with code execution, ARC Prize Verified), surpassing Google's previous models as well as those of OpenAI and Anthropic.
In terms of coding ability, Gemini 3 is the best vibe coding and agentic coding model Google has ever built.
The model topped the WebDev Arena leaderboard with a score of 1487 Elo. It scored 54.2% on Terminal-Bench 2.0, which tests a model's tool-use ability, and far outperformed 2.5 Pro on SWE-bench Verified, a benchmark that measures coding-agent capabilities.
Developers can build with Gemini 3 in Google AI Studio, Vertex AI, Gemini CLI, and Google's new agent development platform, Google Antigravity. It is also available on third-party platforms such as Cursor, GitHub, JetBrains, Manus, and Replit.
Since Gemini 2, Google's Gemini models have made steady progress on agentic capabilities. This time, Gemini 3 also topped the Vending-Bench 2 leaderboard. This benchmark assesses a model's long-horizon planning by simulating the operation of a vending machine business. The results show that over a full simulated year of operation, Gemini 3 Pro maintained consistent tool use and coherent decision-making, earning higher returns without drifting off task.
This means that Gemini 3 can help users complete daily tasks, such as booking local services or organizing the inbox.
03.
A new agent development platform makes its debut
Realize end-to-end software development automation
Today, Google also released a new agent development platform, Google Antigravity.
With Gemini 3's advanced reasoning, tool use, and agentic coding capabilities, Google Antigravity turns AI assistance from just another tool in the developer's toolkit into an active partner.
Although the core of Google Antigravity is still a familiar AI integrated development environment (AI IDE) experience, its agents now have a dedicated interface of their own and direct access to the editor, terminal, and browser. These agents can autonomously plan and execute complex end-to-end software tasks in parallel on the developer's behalf while also validating their own code.
In addition to Gemini 3 Pro, Google Antigravity also integrates Google's latest Gemini 2.5 Computer Use model and the image-editing model Nano Banana.
In one demonstration, Google Antigravity used Gemini 3 to drive an end-to-end agentic workflow for a flight-tracking app. The agent independently planned the work, wrote the application code, and verified that the app ran correctly through browser-based computer use.
Finally, Google also said that Gemini 3 is its safest model to date and has undergone the most comprehensive set of safety evaluations of any Google AI model. Evaluation results show reduced sycophancy, stronger resistance to prompt injection, and improved protection against misuse for cyberattacks.
Nearly two years have passed since the first Gemini model was released in December 2023. Gemini 1 made breakthroughs in native multimodality and long context windows, expanding the types and volume of information that can be processed. Gemini 2 helped users handle more complex tasks and ideas, and Gemini 2.5 Pro led the LMArena rankings for more than six months.
Today, Google's Gemini-powered AI Overviews search feature has 2 billion monthly active users, the Gemini app has more than 650 million monthly active users, more than 70% of Google's cloud customers use its AI features, and 13 million developers have built with its generative models.