
Google unleashes a "price slasher": Gemini 3 Flash outperforms Pro at only a quarter of the cost and with lightning-fast speed.

Zhidx 2025-12-18 11:03
The performance in multiple fields is comparable to that of the Pro model.

Zhidx reported on December 18 that Google released Gemini 3 Flash last night, aiming to provide cutting-edge intelligence at a lower cost.

Specifically, its output price per million tokens is only 20% of Claude Sonnet 4.5's and 21% of GPT-5.2's, yet it can match or even surpass these flagship models in benchmark tests.

Even compared with Gemini 3 Pro, Gemini 3 Flash is highly cost-effective. Flash costs only 25% as much as Pro, but it outperforms the Pro version in core benchmark tests such as MMMU-Pro and SWE-bench Verified.
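To make the ratios quoted above concrete, here is a minimal sketch that converts them into relative output-token spend. It relies only on the percentages given in this article; the unit price and the workload size are hypothetical placeholders, not published list prices.

```python
# Ratio of Gemini 3 Flash's output-token price to each model's price,
# as quoted in the article (Flash price / other model's price).
FLASH_PRICE_RATIO = {
    "Claude Sonnet 4.5": 0.20,
    "GPT-5.2": 0.21,
    "Gemini 3 Pro": 0.25,
}


def implied_price(flash_price: float, ratio: float) -> float:
    """Price of the other model implied by Flash's price and the quoted ratio."""
    return flash_price / ratio


def output_cost(price_per_million: float, output_tokens: int) -> float:
    """Cost of generating `output_tokens` tokens at a per-million-token price."""
    return price_per_million * output_tokens / 1_000_000


# Hypothetical inputs: 1.0 price unit per million output tokens for Flash,
# and a workload that generates 50 million output tokens.
flash_price = 1.0
workload_tokens = 50_000_000

flash_cost = output_cost(flash_price, workload_tokens)
for model, ratio in FLASH_PRICE_RATIO.items():
    other_cost = output_cost(implied_price(flash_price, ratio), workload_tokens)
    print(f"{model}: {other_cost / flash_cost:.2f}x the Flash spend")
```

Under these ratios, the same output workload costs roughly four to five times as much on the other three models. Input-token prices are not covered by the figures in this article, so a full bill would differ.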

The previously released Gemini 3 models have shown strengths in complex reasoning, multimodal and visual understanding, agentic tasks, and Vibe Coding. Gemini 3 Flash retains this foundation, combining the reasoning ability of Gemini 3 Pro with Flash-level latency, efficiency, and cost.

Jeff Dean, Google's Chief Scientist, said that Gemini 3 Flash is not only higher in quality than 2.5 Pro but also three times faster, at only a fraction of the price. Here is a side-by-side demonstration:

Gemini 3 Flash is now fully available. Developers can use it through the Gemini API in Google AI Studio, Gemini CLI, and the agent development platform Google Antigravity. Ordinary users can use it through the Gemini app and AI Mode in Google Search.
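For developers going through the Gemini API, a text request might look roughly like the sketch below, which builds a `generateContent` REST call with only the Python standard library. The model identifier `gemini-3-flash-preview` and the `GEMINI_API_KEY` environment variable name are assumptions not taken from this article; check the current API documentation before relying on them.

```python
import json
import os
import urllib.request

# Assumptions, not from the article: the model identifier and the name of
# the environment variable holding the API key.
MODEL_ID = "gemini-3-flash-preview"
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt: str, api_key: str, model: str = MODEL_ID) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def generate(prompt: str, api_key: str) -> str:
    """Send the request and return the text of the first candidate's first part."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        payload = json.load(resp)
    return payload["candidates"][0]["content"]["parts"][0]["text"]


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        print(generate("Explain what a Pareto frontier is in one sentence.", key))
    else:
        print("Set GEMINI_API_KEY to send a live request.")
```

Separating `build_request` from `generate` keeps the payload inspectable without network access; in practice an official SDK would likely be the more convenient route.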

01. Built for Iterative Development and Enables "Programming by Voice"

What exactly can Gemini 3 Flash do? Google says it is a model built specifically for iterative development, capable of providing programming performance close to that of Gemini 3 Pro with low latency.

Google shared several cases. For example, Gemini 3 Flash can perform multimodal reasoning in a hand-tracking "pinball puzzle game" and provide near-real-time AI assistance.

It can also build and A/B test new loading animation designs in near real time, simplifying the process from design to code.

Using multimodal reasoning, Gemini 3 Flash can quickly analyze images with contextual UI overlays, generate subtitles, and ultimately transform static images into interactive experiences.

With its excellent performance in reasoning, tool use, and multimodal capabilities, Gemini 3 Flash is particularly suitable for developers who want to conduct more complex video analysis, data extraction, and visual Q&A.

The multimodal reasoning ability of Gemini 3 Flash can be used to help users see, hear, and understand any type of information. Users can ask Gemini to understand videos and images and transform the content into a helpful and actionable plan within seconds.

Gemini 3 Flash in the Gemini app can analyze short-video content and give you a plan, such as how to improve your golf swing.

Since Gemini 3 Flash is optimized for speed, it can "see" and guess what you are drawing while you are still drawing.

You can upload a recording, and Gemini 3 Flash will identify your knowledge gaps, create a customized quiz, and provide detailed explanations for the answers.

Alternatively, you can try "programming by voice" and build interesting and useful applications from scratch using only voice input. Gemini 3 Flash can transform unstructured ideas into a functional application within minutes.

02. Outperforms Pro-level Models in Multiple Domains and Can Automatically Adjust How Much It Thinks

How does Gemini 3 Flash perform in benchmark tests? In PhD-level reasoning and knowledge benchmarks such as GPQA Diamond (90.4%) and Humanity's Last Exam (33.7% without tools), it rivals larger cutting-edge models and significantly outperforms Gemini 2.5 Pro across multiple benchmarks.

On SWE-bench Verified, a benchmark that evaluates the capabilities of coding agents, Gemini 3 Flash scored 78%, surpassing not only the 2.5 series but also Gemini 3 Pro.

It also scored 81.2% on MMMU-Pro, a state-of-the-art result comparable to Gemini 3 Pro's.

In the comparison shown in the figure below, Gemini 3 Flash outperformed models such as Claude Sonnet 4.5 and Gemini 2.5 Pro in almost all benchmark tests.

In addition to cutting-edge reasoning and multimodal capabilities, Gemini 3 Flash is built for high efficiency, pushing the Pareto frontier of quality, cost, and speed. The scatter plot below plots the LMArena Elo scores of multiple language models against their price per million tokens, with a line marking the Pareto frontier passing through Gemini 3 Pro, Gemini 3 Flash, and Gemini 3 Flash Lite.
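For readers unfamiliar with the term, a model sits on the price-versus-Elo Pareto frontier when no other model is both cheaper and higher-scoring. The sketch below shows one simple way to compute such a frontier; the model names, prices, and Elo values are hypothetical placeholders, not the figures from Google's chart.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    price: float  # price per million tokens (lower is better)
    elo: float    # arena-style Elo score (higher is better)


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Keep each model that scores higher than every model visited before it."""
    frontier = []
    best_elo = float("-inf")
    # Visit models from cheapest to most expensive; on a price tie,
    # visit the higher-scoring model first.
    for m in sorted(models, key=lambda m: (m.price, -m.elo)):
        if m.elo > best_elo:
            frontier.append(m)
            best_elo = m.elo
    return frontier


# Hypothetical data points, for illustration only.
models = [
    Model("A", 0.5, 1300),
    Model("B", 2.0, 1400),
    Model("C", 3.0, 1350),   # dominated by B: pricier and lower-scoring
    Model("D", 8.0, 1480),
    Model("E", 10.0, 1450),  # dominated by D
]

print([m.name for m in pareto_frontier(models)])
```

A model that "pushes" the frontier is one that, once added to the set, displaces points that were previously undominated.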

At its highest thinking level, Gemini 3 Flash can adjust how much it thinks, and it may think longer for more complex use cases. Even so, measured on typical traffic, it uses 30% fewer tokens on average than 2.5 Pro while completing everyday tasks accurately and with higher performance.

03. Conclusion: Completing the Gemini 3 Model Landscape and Expected to be Deeply Integrated into Daily Applications

The Gemini 3 models have been well received since their release, but their high cost has deterred many users. Gemini 3 Flash fills out the Gemini 3 family with a lightweight, highly cost-effective option that meets the needs of developers in real-world production environments.

From iterative development and Vibe Coding to multimodal applications, real-time interaction, and agent systems, the cost-effectiveness of Gemini 3 Flash is expected to help integrate intelligence more widely into daily applications and business systems.

This article is from the WeChat public account "Zhidx" (ID: zhidxcom). Author: Chen Junda, Editor: Li Shuiqing. Republished by 36Kr with permission.