Why has the Gemini experience become increasingly awkward? Good AI is getting more expensive the more you use it, and free AIs are collectively getting dumber.
Judging from Google's recent promotional materials alone, you would probably think Gemini is almost invincible.
For video generation there is Omni; for image generation, Nano Banana. Gemini 3.5 Flash is said to outperform even 3.1 Pro, and Gemini Spark can complete tasks for you automatically. From the press conference to the official blog, Gemini comes across as a well-rounded warrior, making progress on almost every front.
In fact, Lei Technology also praised it highly when covering Google I/O not long ago. But after using it for a while, I increasingly find Gemini 3.5 Flash a bit disappointing.
(Image source: Google)
It's not that it performs poorly in benchmarks, nor that its capabilities are at the bottom of the pack. On the contrary, many of its capabilities are still in the industry's top tier.
The problem is that once the new features being promoted are put into daily use, there's always a hard-to-describe sense of awkwardness. You know it's powerful, but it never feels that useful; you know the features exist, but you never feel you can really make use of them.
This disconnect has become common in the large-model world lately: what vendors show is the upper limit of capability, while what users experience is everyday usage. The former keeps getting more impressive, but the latter does not necessarily improve in step.
Gemini 3.5 Flash may be one of the most obvious examples of this contradiction. There are so many complaints that I have to voice them.
Quotas, routing, and capabilities: the experience is even more frustrating
Let's start with the most easily noticeable problem.
Quotas.
Before the I/O 2026 conference, Google quietly changed the quota rules for paid subscriptions, switching from a fixed number of messages to a compute-based quota.
Put simply, Gemini used to count only the number of interactions. Usage counts for the image, video, audio, and text models were tracked separately and reset every 24 hours.
In practice, Pro members could generate 5 videos and 50 images a day, and the text quota was practically impossible to use up.
(Image source: Lei Technology)
After the change, Google now imposes both a weekly quota and a short-term quota that resets every five hours.
Usage for every task is now calculated from token consumption. If you make the model think harder, it costs you more than before, even if the reply it produces is exactly the same.
The question is, how can I know how much computing power a task will consume for the model?
(Image source: Lei Technology)
Moreover, all the features that used to be counted separately now draw from this single usage quota. Whether it's video, images, in-depth research, or Agent, once one feature has used up the quota, you can't do anything for the next few hours.
In my own experience, generating a video with Omni Flash consumes about 1/3 of the Pro subscription quota. If you want to modify the video, it will use at least 1/2 of the Pro subscription quota. It's really not enough.
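To see how quickly those fractions add up, here is a minimal Python sketch. It treats my rough estimates (about 1/3 of the quota per video, at least 1/2 per modification) as exact values, which is an assumption, and the function name is purely illustrative rather than anything Google exposes.

```python
from fractions import Fraction

# Rough estimates from the article, treated as exact fractions for illustration.
VIDEO_COST = Fraction(1, 3)  # one Omni Flash video generation
EDIT_COST = Fraction(1, 2)   # one video modification (the article says "at least")

def remaining_quota(actions):
    """Return the quota left after the actions, or None if they do not fit."""
    costs = {"video": VIDEO_COST, "edit": EDIT_COST}
    left = Fraction(1)
    for action in actions:
        left -= costs[action]
        if left < 0:
            return None
    return left

print(remaining_quota(["video", "edit"]))           # 1/6
print(remaining_quota(["video", "edit", "video"]))  # None
```

Under these assumptions, one video plus one round of edits leaves only a sixth of the quota, which is not enough for a second video.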
What affects the experience more than the quota is actually the routing problem.
This is not just my personal impression. Many users have run into similar situations recently: image generation works normally, and then partway through the conversation Gemini suddenly says it can't generate images, telling you it is only a text model and can't handle such tasks.
(Image source: Lei Technology)
Most ridiculous of all, sometimes it returns only text, with no image at all.
(Image source: Lei Technology)
It's understandable if this happens occasionally, but when it happens frequently, users really can't tell whether the feature has failed or the request has been routed to the wrong model.
There are similar problems at the capability level.
Gemini 3.5 Flash always gives the impression that it can do things but often fails to do them reliably. For the same math or reasoning problem, it sometimes gives a very good answer, yet if you ask again a few hours later, the result may be completely different.
I've tested several classic logic problems. In many cases its early analysis is correct and the reasoning chain looks complete, but the final step often contains some inexplicable mistake. Most outrageous of all, it is very confident, and its tone doesn't change even when the answer is wrong.
As for simpler calculation problems, it still makes mistakes.
(Image source: Lei Technology)
I know that this kind of problem doesn't matter much for chatting, but in learning, work, or even programming scenarios, the impact is completely different.
Does a good AI always get more and more expensive?
If the previous problems are at the experience level, the deeper problem actually comes from Google's recent product and pricing strategies.
In my opinion, the story Google likes to tell the most this year is about Agent.
From the press conference to the official promotion, almost all the focus is on Gemini Spark: automatic data search, information organization, task execution, and even cross-application operations carried out on the user's behalf. It sounds genuinely futuristic and matches everyone's idea of what an Agent should be.
The problem is that you need an Ultra subscription to use Gemini Spark. This subscription starts at $99.99 per month, and the highest tier costs $199.99 per month (about 1352.98 yuan) for a limited time.
(Image source: Lei Technology)
For comparison, $20 per month is all it takes to get an OpenAI subscription and the invincible Codex.
So, there's a very interesting phenomenon: when people watch the press conference, they think Gemini is invincible, but when they open the product, the first thing they see is the upgrade button.
This gap hurts Gemini's reputation more readily than missing features would, because users know the capabilities exist and work well, yet they simply can't access them.
As for coding, the price is not that cheap either.
At the I/O 2026 conference, Google CEO Pichai repeatedly emphasized the cost advantage of Gemini 3.5 Flash.
According to the official price list, Gemini 3.5 Flash charges $1.5 per million input tokens and $9 per million output tokens. By comparison, the API pricing of Claude Opus 4.7 is $5 per million input tokens, and GPT-5.5 Pro charges $30 per million input tokens.
(Image source: Lei Technology)
Going by the price list alone, it is indeed much cheaper, and even gives the impression of a small-margin, high-volume strategy.
But the price list is only a reference. For people who actually use the model, what matters more is how much it costs to complete the same task.
Artificial Analysis tallied the costs in its Agent evaluation. Running the full set of tasks with Gemini 3.5 Flash cost more than $1500, while Gemini 3 Flash cost less than $300, a gap of more than five times. Even compared with Gemini 3.1 Pro, the overall cost of Flash is much higher, and it is even more expensive than GPT-5.5.
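To put the list prices in concrete terms, here is a small Python sketch. The per-million-token prices are the ones quoted above; the request size (100,000 input tokens and 10,000 output tokens) and the function name are hypothetical, chosen only for illustration.

```python
# Prices in US dollars per million tokens, as quoted in this article.
GEMINI_35_FLASH_INPUT = 1.5
GEMINI_35_FLASH_OUTPUT = 9.0
CLAUDE_OPUS_47_INPUT = 5.0
GPT_55_PRO_INPUT = 30.0

def request_cost(input_tokens, output_tokens, input_price, output_price):
    """List-price cost in dollars of a single request."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Hypothetical request size, used only to show the arithmetic.
cost = request_cost(100_000, 10_000, GEMINI_35_FLASH_INPUT, GEMINI_35_FLASH_OUTPUT)
print(f"Gemini 3.5 Flash request: ${cost:.2f}")  # $0.24

# Input-price ratios relative to Gemini 3.5 Flash.
print(f"Claude Opus 4.7 input: {CLAUDE_OPUS_47_INPUT / GEMINI_35_FLASH_INPUT:.1f}x")  # 3.3x
print(f"GPT-5.5 Pro input: {GPT_55_PRO_INPUT / GEMINI_35_FLASH_INPUT:.1f}x")          # 20.0x
```

On input price alone, the competitors quoted here come out roughly 3 to 20 times more expensive, which is the impression the price list is meant to give.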
(Image source: Lei Technology)
Where's the problem?
The answer is simple: it talks too much.
In the Agent test, Gemini 3.5 Flash needs nearly 50 rounds of dialogue on average to complete a task, while many competitors finish in about 20. Don't underestimate those extra few dozen rounds. In every new round, the model has to read the entire preceding conversation again, so the more rounds there are, the faster tokens are consumed.
It's like taking a taxi. The price per kilometer may indeed be cheap, but if the driver circles the city three times, what the passenger ends up paying is the total fare, not the advertised rate.
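The effect of extra rounds can be sketched in a few lines of Python. This is a simplified model under loud assumptions: every round adds the same number of tokens (a hypothetical 1,000), the full history is re-read as input each round, and there is no caching or discount. Real Agent runs will differ.

```python
def cumulative_input_tokens(rounds, tokens_per_round):
    """Total input tokens when each round re-reads all earlier rounds plus its own new content."""
    total = 0
    history = 0
    for _ in range(rounds):
        history += tokens_per_round  # the conversation grows by one round
        total += history             # the whole history is read again as input
    return total

TOKENS_PER_ROUND = 1_000  # hypothetical figure, for illustration only

short_run = cumulative_input_tokens(20, TOKENS_PER_ROUND)
long_run = cumulative_input_tokens(50, TOKENS_PER_ROUND)

print(short_run)                       # 210000
print(long_run)                        # 1275000
print(round(long_run / short_run, 2))  # 6.07
```

Under these assumptions, 2.5 times as many rounds means about 6 times as many input tokens, because the total grows roughly with the square of the round count. That is how a low per-token price can still turn into a high total bill.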
The new contradiction in AI: powerful at launch, but disappointing in use
That said, I don't think Gemini 3.5 Flash is a failed model.
In fact, it still belongs to the industry's top tier. Its multimodal capabilities are very strong, its video generation is good, and search and integration remain Google's forte. Many of its individual capabilities are quite competitive across the industry.
The problem lies in the forcibly reduced usage quota and the frequent drops in intelligence caused by a shortage of computing power.
(Image source: Lei Technology)
No matter how Google promotes it, ordinary users don't care about rankings or how much computing power Gemini 3.5 Flash saves. They care about whether they can complete tasks smoothly, whether the results are stable, whether they can avoid studying complex rules, and whether they can stop worrying about when the quota will suddenly run out.
This is why more and more people have recently started to miss some of the older models.
About half a year ago, Google AI Studio gave free users 50 interactions with the Pro model every day. It's a real pity.
For Gemini, the greatest hope in the future still lies in Agent.
After all, Google has the most complete ecological resources in the industry. As long as it can really connect search, email, calendar, documents, and the Android system in the future and let the Agent help users complete more real tasks, it still has a chance to establish an advantage that other manufacturers can't easily replicate.
But at this stage, my evaluation of Gemini 3.5 Flash obviously won't change.