Gemini 3 Flash Unveiled: Surpassing Pro in Intelligence, Three Times Faster, and Free Globally

Fast! It's really fast!

Google dropped a bombshell at the end of 2025: Gemini 3 Flash! This model completely shatters the law that "fast means stupid and powerful means expensive." It achieves "zero - latency" responses at three times the speed of its predecessor and even outperforms its Pro - level sibling in programming and logical reasoning.

Gemini 3 Flash is officially released!

With this, the Gemini 3 family is now complete: Flash, Pro, and Deep Think.

The Flash model is now fully available on the Gemini APP, AI Studio, Google Antigravity, and Gemini CLI. When users open Gemini, the default version is Gemini 3 Flash, and it can be used for free!

If previous AI models were simulating human thinking, then Gemini 3 Flash is simulating human "intuition."

Three times the speed of Gemini 2.5 Pro, yet it has reasoning abilities that surpass the Pro - level model.

This is not just an upgrade; it's a game - changer for the existing AI interaction experience!

After testing Gemini 3 Flash, the only feeling is: fast! Incredibly fast.

It's so fast that there's "no loading bar." This experience is like a "zero - latency" magic. As soon as you hit enter, the answer is already rendered on the screen.

Not only is it outrageously fast, but even more terrifyingly, its intelligence directly "backstabs" its Pro sibling in some fields.

Normally, "Flash" means "dumbed - down," but this time it's different.

Gemini 3 Flash even directly surpasses Gemini 3 Pro in some complex Agentic Coding tasks!

For example, Flash achieved 81.2% in MMMU Pro (Multimodal Understanding and Reasoning), outperforming Gemini 3 Pro's 81.0%.

Considering the API cost, compared to Gemini 3 Pro, the cost of Flash is directly cut to a quarter.

It's cheaper, yet its performance is even better!

It's estimated that Google will keep the entire large - model industry "up all night" tonight.

Beating Claude and GPT

In the latest evaluation by Artificial Analysis, 3 Flash has made a qualitative leap compared to the previous - generation 2.5 Flash. This should be the biggest upgrade within the same series of models in 2025!

It's hard to imagine that a lightweight Flash model can outperform Claude's flagship model, Opus 4.5. (It's estimated that Anthropic is more restless than OpenAI.)

In other metrics, Flash has also reached the level of top - tier models.

Flash has demonstrated cutting - edge performance in doctoral - level reasoning and knowledge benchmark tests such as GPQA Diamond (90.4%) and Humanity’s Last Exam (33.7% without tools), comparable to larger cutting - edge models and significantly outperforming the previous flagship, Gemini 2.5 Pro, in multiple benchmark tests.

In the ARC - AGI Semi - Private Eval, Gemini 3 Flash also performs very competitively, and its cost is significantly lower than other cutting - edge models.

ARC - AGI - 1: 84.7%, $0.17/task

ARC - AGI - 2: 33.6%, $0.23/task

On LMArena, the text ability of Gemini 3 Flash has risen directly to the 3rd place!

Gemini 3 Flash proves that speed and scale don't have to come at the expense of intelligence.

In addition to its cutting - edge reasoning and multimodal abilities, Gemini 3 Flash is designed for efficiency, pushing the Pareto frontier between quality, cost, and speed.

When performing the highest - level thinking processes, Gemini 3 Flash can adjust its "thinking volume."

For more complex use cases, it may take longer to think, but under typical traffic, it consumes 30% less Tokens on average than 2.5 Pro and can accurately complete daily tasks with higher performance.

The core advantage of Gemini 3 Flash lies in its native speed!

It outperforms 2.5 Pro but is three times faster (based on the Artificial Analysis benchmark test), and its cost is only a fraction.

The pricing of Gemini 3 Flash is $0.50 per million input Tokens and $3 per million output Tokens (audio input remains at $1 per million input Tokens).

A New Favorite for Developers: The Perfect Balance of Speed and Depth

For developers, the response speed of the model is the top priority.

Gemini 3 Flash is designed for iterative development, offering the coding performance of Gemini 3 Pro and low latency - it can quickly reason and solve tasks in high - frequency workflows.

In the SWE - bench Verified benchmark test for evaluating the ability of coding agents, Gemini 3 Flash scored 78%, surpassing not only the 2.5 series but also Gemini 3 Pro.

It can be said that it achieves an ideal balance between agent coding, production - level systems, and responsive interactive applications.

Meanwhile, the strong performance of Gemini 3 Flash in reasoning, tool use, and multimodal abilities makes it very suitable for developers who want to conduct more complex video analysis, data extraction, and visual question - answering.

This means it can empower smarter applications that require both extremely fast responses and in - depth reasoning.

For example, Gemini 3 Flash can perform multimodal reasoning in a "pinball puzzle game" with hand - tracking, providing near - real - time AI assistance.

Or, it can build and A/B test new loading animation designs in near - real - time, simplifying the process from design to code.

该文观点仅代表作者本人，36氪平台仅提供信息存储空间服务。

Gemini 3 Flash is here with a lightning strike: Its intelligence surpasses that of Pro, it's three times faster, and it's free worldwide.

Beating Claude and GPT

A New Favorite for Developers: The Perfect Balance of Speed and Depth