
Google and DeepMind still don't see eye to eye.

硅星人Pro (Silicon Star Pro) · 2026-05-21 16:25
The old problem comes back again.

This year's Google I/O keynote was once again so densely packed with information that outsiders struggled to pick out the key points.

Google has always been an important benchmark for Chinese AI companies to learn from. This year, however, I/O had no flagship top-tier model, and its Agent products were de facto delayed and playing catch-up. Together these factors left many AI practitioners disappointed and curious about what Google was actually up to.

Over the past two days, I had the opportunity to speak with key figures at Google, including DeepMind CTO Koray Kavukcuoglu, Google's Chief AI Scientist Jeff Dean, and Google CEO Sundar Pichai. Based on this first-hand information, I will try to make sense of Google's current situation.

Ultimately, I found that it all boils down to the way Google treats DeepMind.

What does Google think is the most important thing at the moment?

Obviously, the biggest consensus within Google at the moment is that it is the only true full-stack AI company. In various exchanges, the confidence this gives the company exceeded even my already high expectations. However, it also recalls Google's initial embarrassment after ChatGPT emerged: back then it was just as confident in Bard and its entire AI infrastructure, and the problems arose precisely from overconfidence.

Which version of the story will unfold next? In my opinion, the most crucial factor is how Google allocates resources. Initially it bet on Bard, but allocating money and computing power to DeepMind later proved to be the right choice.

Now it is at another critical juncture. Because full-stack means a huge competitive advantage, Google's current resource allocation decision is not to pour all of its best resources into the most cutting-edge models, but to focus more on the most usable ones.

It seems familiar.

So the Flash series, it seems, is not a compromise. It carries the greatest strategic significance and may take up a larger share of the available computing power. This is very different from the strategies of OpenAI and Anthropic.

In a demo at the I/O keynote that few people noticed, Gemini 3.5 Flash ran tasks on TPU 8i with throughput so high that it was reminiscent of Groq, which used to pursue speed above everything else. Meanwhile, feedback on the new model has been mixed. This looks like an inevitable result of that priority and resource allocation.

As for the reason, it is what Google has been saying all along. Pichai said in a small-group exchange that the demand they see is huge and constantly growing. If you think back to the stories of the Internet and mobile phones, you know what to do: quickly identify and meet the broadest needs, beyond those of heavy users. Given the popularity of agents, providing the most usable and suitable models rather than the "strongest" ones is Google's most important opportunity and task at the moment.

So, how will this decision affect the competitive landscape in the next few months?

I have a feeling that the "public opinion environment" Pichai faces is likely to become tense again for some time because of these decisions.

The Gap of DeepMind

Koray said he had worn the same T-shirt to talk to me for three consecutive years. My questions to him have been consistent as well: in the first year, I asked him how to define so-called native multimodality; in the second year, I asked about the specific methods behind multimodality and why Veo was so powerful; this year, I asked how he evaluates this non-consensus choice in retrospect and what it has brought to Google.

As the most prominent spokesperson for DeepMind, he described it as one of Google's most correct decisions. Jeff Dean, who no longer appears on the I/O main stage but still has significant internal influence, also believes that among all of this year's releases, Omni is of great significance because it truly brings the capabilities accumulated in Gemini into video.

In the model line led by DeepMind, intelligence as they define it must be future-oriented and cannot simply be a one-dimensional extension of abilities that are already prominent today, such as language skills. However, there is still no "recipe" for how to train such a model; in essence, it takes continuous experimentation.

And experiments mean resource consumption.

After multiple exchanges, I clearly felt the tension here. Google may even consider "restraint" when it sees that the capabilities of some modalities can exceed others.

Omni was also the only model at this year's I/O presented personally by DeepMind CEO Demis Hassabis. However, it has drawn a lot of negative feedback since its release. An important reason is that the released version is a Flash model.

The reasoning here differs from the strategy of releasing Gemini Flash first. Pichai defined Omni as a new model that is a generation ahead of existing models, and said that for safety and responsibility reasons, Omni Flash was therefore released first.

Considering what Google thinks is most important at the moment, this may once again foreshadow what is to come.

In the first stage, the native multimodality pursued by DeepMind was a non-consensus bet on a technological route, and it faced significant internal pressure.

In the second stage, Veo and Nano Banana emerged from this route, which dispelled many doubts.

But now, in the third stage, it feels less like a technological route. Surprisingly, you will find that it has been increasingly internalized and absorbed by Google's business needs.

An interesting line of internal thinking points the same way. Google believes that another important return from native multimodality is the help it gives to Google's full-stack route of combining software and hardware. Multimodal capabilities can be used to accelerate hardware iteration, and AI is now being used in this "internal cycle" rather than by DeepMind to create the next Nano Banana-level innovation.

Which matters more to Google: the most powerful model DeepMind is capable of creating, or the practical models that can improve revenue and user experience today in actual businesses such as search?

This question has come back like a ghost, and it seems that Google and DeepMind are still not on the same page.

This article is from the WeChat official account "Silicon Star Pro", author: Wang Zhaoyang. Republished by 36Kr with permission.