
OpenAI in jeopardy, DeepSeek pulls out a big move: Matching Google's best and taking on GPT-5 High

Xinzhiyuan (新智元), 2025-12-02 08:54
"Origin God", activate!

The "God of Open Source", DeepSeek, has officially released the V3.2 version, with its performance comprehensively surpassing GPT-5 High and on par with Google's Gemini-3.0 Pro. The new model has not only won 4 gold medals in international Olympiads but also broken the "impossible triangle" of "speed, cost, and intelligence" with its original DSA sparse attention architecture.

OpenAI should really be worried this time!

Just now, "The God of Source", DeepSeek, has open-sourced the official version of DeepSeek-V3.2 -

On multiple reasoning benchmarks spanning mathematics and programming, it comprehensively surpasses GPT-5 High and outperforms Claude Sonnet 4.5;

Compared with the widely discussed Gemini 3.0 Pro, it is evenly matched!

Table 1: Scores of DeepSeek-V3.2 and other models on various evaluation sets in mathematics, code, and general fields (the estimated total number of consumed Tokens is in parentheses)

This year, DeepSeek has already released a whole lineup of models, truly worthy of the title "God of Open Source":

DeepSeek‑R1, DeepSeek‑R1‑Zero

DeepSeek‑V3, DeepSeek‑V3.1, DeepSeek‑V3.1-Terminus, DeepSeek‑V3.2‑Exp

DeepSeek‑OCR, DeepSeek‑Math-V2

A Blockbuster from the Start: Open-Sourcing an AI with 4 Olympiad Gold Medal-Level Achievements

The brand-new model, DeepSeek-V3.2, is a real blockbuster from the start.

DeepSeek has officially released DeepSeek-V3.2 and DeepSeek-V3.2-Speciale - a reasoning-first model specifically designed for agents!

  • DeepSeek-V3.2: The official iterative version of V3.2-Exp, now available on the App, web version, and API;
  • DeepSeek-V3.2-Speciale: It breaks the boundary of reasoning ability and is currently only available through the API.

Both models achieve world-class reasoning performance:

  • V3.2: It balances reasoning ability and text length, has GPT-5 level performance, and is suitable for daily use;
  • V3.2-Speciale: It has extreme reasoning ability and has achieved 4 gold medal-level results; currently, only the API version is available (tool calls are not supported) to support community evaluation and research.

On mainstream reasoning benchmark tests, the performance of DeepSeek-V3.2-Speciale is comparable to that of Gemini-3.0-Pro (see Table 1).

Even more remarkable is that the V3.2-Speciale model has successfully won multiple gold medals:

  • IMO 2025 (International Mathematical Olympiad)
  • CMO 2025 (China Mathematical Olympiad)
  • ICPC World Finals 2025 (International Collegiate Programming Contest World Finals)
  • IOI 2025 (International Olympiad in Informatics)

Among them, its ICPC and IOI results would have ranked second and tenth, respectively, among the human contestants.

DeepSeek-V3.2 is the first model to directly integrate thinking into tool use and supports using tools in both thinking and non-thinking modes.
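As a rough sketch, a tool-enabled request might look like the following, assuming an OpenAI-style chat-completions format. The model identifier "deepseek-v3.2", the "reasoning" toggle, and the get_weather tool are all illustrative assumptions, not the documented schema; consult the official API docs for the real parameters.

```python
# Hypothetical request payload for a tool-enabled chat model.
# Model name, "reasoning" switch, and the get_weather tool are
# ASSUMPTIONS for illustration, not DeepSeek's documented API.

import json

def build_request(question: str, thinking: bool) -> dict:
    """Build an OpenAI-style chat payload that exposes one tool."""
    return {
        "model": "deepseek-v3.2",  # assumed identifier
        "messages": [{"role": "user", "content": question}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # illustrative tool, not a real API
                "description": "Look up today's weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
        "reasoning": {"enabled": thinking},  # assumed thinking-mode toggle
    }

payload = build_request("Do I need an umbrella in Hangzhou today?", thinking=True)
print(json.dumps(payload)[:60])
```

The point of the design is that the same tool list works in both modes; only the (assumed) reasoning toggle changes.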

Currently, both models have been open-sourced:

· DeepSeek-V3.2

HuggingFace: https://huggingface.co/deepseek-ai/DeepSeek-V3.2

ModelScope: https://modelscope.cn/models/deepseek-ai/DeepSeek-V3.2

· DeepSeek-V3.2-Speciale

HuggingFace: https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale

ModelScope: https://modelscope.cn/models/deepseek-ai/DeepSeek-V3.2-Speciale

From "Engine Verification" to "All-Round Racer": The Evolution of DeepSeek V3.2

If the DeepSeek-V3.2-Exp released two months ago was a "concept car" roaring on the track, used to prove the power potential of the "sparse attention" engine to the world;

Then the officially launched DeepSeek V3.2 today is a "mass-produced supercar" that has completed interior refinement, is equipped with a top-level navigation system, and can hit the road at any time to solve complex problems.

This is the biggest evolutionary logic of DeepSeek V3.2 compared to the Exp version (experimental version): The core engine remains the same, but there is a qualitative change in driving skills (Agent ability).

V3.2 Official Version vs. Exp: Learning to "Think While Doing"

At the architectural level, V3.2 uses the DSA architecture successfully verified in the Exp version. However, in terms of "soft power", DeepSeek has solved a long-standing problem in the AI field - the disconnection between thinking and action.

During the V3.2-Exp period (and most other reasoning models), the model was like an old scholar with a poor memory: it would spend a long time thinking first and then decide to call a tool (such as searching for the weather).

But when the tool returned the result "It's rainy today", it often "blacked out" and forgot where it was in the thinking process and had to plan again.

The official version of V3.2 introduces "thinking context management".

This is like installing a "working memory buffer" for the model.

Now, V3.2 is like an experienced surgeon. While reaching for a scalpel (calling a tool), the surgical plan in its mind remains clear and coherent, and it can seamlessly proceed to the next step after getting the scalpel.
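The surgeon analogy can be sketched as a toy agent loop: a persistent buffer carries the reasoning across tool calls, so each tool result is spliced into the ongoing plan instead of forcing a restart. Every name below is invented for illustration and says nothing about DeepSeek's actual internals.

```python
# Toy sketch of "thinking context management": the agent keeps its
# chain-of-thought in a buffer that survives tool calls, rather than
# re-planning from scratch after every tool result.
# All names here are illustrative assumptions.

def run_agent(task, tools, plan_steps):
    thinking_buffer = [f"goal: {task}"]  # persistent working memory
    results = []
    for step in plan_steps:
        thinking_buffer.append(f"next: {step['thought']}")
        tool_output = tools[step["tool"]](**step["args"])  # call the tool
        # Splice the tool result back into the ongoing reasoning:
        # the earlier thoughts are still sitting in the buffer.
        thinking_buffer.append(f"observed: {tool_output}")
        results.append(tool_output)
    return thinking_buffer, results

tools = {"get_weather": lambda city: f"{city}: rainy"}
plan = [{"thought": "check the weather first",
         "tool": "get_weather", "args": {"city": "Hangzhou"}}]
buffer, _ = run_agent("umbrella?", tools, plan)
print(buffer)
```

Without the buffer, the loop would have to rebuild "goal" and "next" from scratch after every observation, which is exactly the "blackout" the article describes.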

To master this skill, DeepSeek even built a "virtual training ground" for V3.2.

They synthesized more than 1,800 virtual operating systems, code libraries, and browser environments and generated 85,000 extremely tricky instructions, forcing V3.2 to repeatedly practice "fixing bugs", "searching for information", and "making reports" in the virtual world.

It is this high-intensity special training that has transformed the official version of V3.2 from a "test-taker" who can only solve problems into a "doer" who can skillfully use tools to solve real-world problems.

The Biggest Technological Highlight: Equipping Attention with a "Lightning Indexer"

The reason why V3.2 can be both "smart" and "affordable" is still the underlying breakthrough called DeepSeek Sparse Attention (DSA).

The attention architecture of DeepSeek-V3.2

To understand its awesomeness, we first need to see how "stupid" traditional models are.

When dealing with long documents, traditional models are like a librarian with severe obsessive-compulsive disorder:

To answer a simple question, it forces itself to read every page and every line of every book in the library and calculate the associations between them.

This causes the computational cost to grow quadratically with the length of the document (O(L^2)).

DSA equips this librarian with a set of "lightning indexers".

When a question comes, DSA first scans the "index" at a very low cost, instantly determines which pages of the book may contain the answer, and directly discards 99% of the irrelevant nonsense.

Then, it only conducts in-depth and detailed reading of the selected 1% of key content.

This strategy of "checking the table of contents" rather than "reading the whole book" reduces the computational complexity from the terrifying exponential level to almost linear (O(L)).

Significant Improvements: Breaking the "Impossible Triangle"

The successful implementation of DSA technology has directly broken the "impossible triangle" of "speed, cost, and intelligence" in the AI field.

First, the cost is halved, and long texts are no longer a problem.

For users, feeding a novel or code library with hundreds of thousands of words to the model is no longer a "money-burning" luxury, and the processing speed has changed from "the time to make a cup of coffee" to "the blink of an eye".

Second, the "emergence of intelligence" brought about by surplus computing power is the most exciting part.

Precisely because DSA saves a large amount of computing power, DeepSeek has the confidence to launch the terrifying Speciale version.

Since it can read fast, let it think longer!

The Speciale version uses the saved resources for more in-depth "long thinking" and logical deduction.

The result is astonishing: DeepSeek-V3.2-Speciale has not only surpassed GPT-5 High in hard-core indicators such as mathematics (IMO gold medal) and programming (IOI gold medal) but also tied with Google's most powerful Gemini 3.0 Pro.

From V3.2-Exp, which verified the potential of the DSA engine, to the official V3.2, which adds agent ability, thinking context management, and virtual-training-ground practice, DeepSeek is demonstrating another path to strong intelligence: under tight computing constraints, pushing the limits of reasoning with a smarter architecture, more refined training, and a more open ecosystem.

The emergence of DeepSeek-V3.2 is a shining moment for DeepSeek's open-source AI: rejecting brute-force scaling by money and, in the gaps of the compute race, finding a shortcut to the peak through smarter algorithms.

Reference Materials

DeepSeek V3.2 Official Version: Strengthening Agent Ability and Incorporating Thinking and Reasoning

This article is from the WeChat official account "Xinzhiyuan" (新智元), author: Allen KingHZ. It is published by 36Kr with authorization.