The "Nano Banana" on LMArena received 5 million votes in two weeks, triggering a tenfold increase in traffic. Google and OpenAI are competing head - to - head.
In August, nano-banana topped LMArena's text-to-image leaderboard, driving a tenfold surge in the community's traffic and pushing monthly active users past 3 million. After nano-banana entered blind testing on LMArena, it attracted over 5 million total votes in just two weeks, including more than 2.5 million direct votes, a record for participation. Since its launch in 2023, LMArena has become an arena where tech giants like Google and OpenAI compete fiercely.
In August this year, a mysterious AI image editor named "Nano Banana" effortlessly topped the Image Edit Arena leaderboard, sending LMArena's platform traffic soaring:
Traffic increased tenfold, and monthly active users surpassed 3 million.
Since the model entered blind testing on LMArena, it has attracted over 5 million total votes in just two weeks, including more than 2.5 million direct votes, a record for participation on the platform.
The mystery of nano-banana's identity also sparked extensive speculation in the LMArena community.
Before Google claimed "Nano Banana" and officially named it Gemini 2.5 Flash Image, many users had already guessed that Google was its real owner.
Some users even posted instructions for using the genuine "Nano Banana" on LMArena, a method that is not only free but also requires no login.
Not only can users "get close" to various latest models, but LMArena also provides a real "Colosseum" for the competition of large models. It allows the latest models of companies like Google and OpenAI to compete head - to - head here and be reviewed by thousands of users.
The votes and feedback from users determine the rankings of these large models and provide real - world use - case data for large - model manufacturers to iterate their models, enabling them to improve their models more targeted.
Nano-banana's popularity sent LMArena's traffic up tenfold; according to Wei-Lin Chiang, LMArena's chief technology officer, the site's monthly active users now exceed 3 million.
Google and LMArena have both emerged as the biggest winners of this traffic windfall.
From Chatbot Arena to LMArena
Wei-Lin Chiang and Anastasios Angelopoulos, co-founders of LMArena
LMArena's predecessor, Chatbot Arena, began as a research project at Berkeley in 2023 and was later renamed LMArena.
Chatbot Arena works like a community-run evaluation center: instead of the traditional approach of grading AI through exam-style tests, it hands evaluation power to community users, ranking large models through anonymous, crowdsourced pairwise comparisons.
Users can also pick specific models to test on their own.
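The core mechanic is straightforward. Here is a minimal sketch of the blind pairwise "battle" flow, with hypothetical model names and a stubbed-out model call (the source does not describe LMArena's actual implementation):

```python
import random

# Illustrative model pool; these names are placeholders, not real endpoints.
MODELS = ["model-a", "model-b", "model-c", "model-d"]

def respond(model: str, prompt: str) -> str:
    # Stand-in for a real model API call.
    return f"[{model}'s answer to: {prompt}]"

def run_battle(prompt: str) -> None:
    # Two models are paired anonymously and at random.
    left, right = random.sample(MODELS, 2)
    print("Assistant A:", respond(left, prompt))
    print("Assistant B:", respond(right, prompt))
    vote = input("Which answer is better? (A/B/tie): ").strip().upper()
    # Identities are revealed only after the vote is cast,
    # which is why users could not pick nano-banana directly.
    print(f"Assistant A was {left}, Assistant B was {right}; you voted {vote}")

run_battle("Edit this photo so the sky looks like a sunset.")
```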
The release of large models such as ChatGPT and Llama 1 created the opening for Chatbot Arena to emerge.
At the time, there was no effective way to evaluate large models, so Chiang, together with Berkeley's Anastasios Angelopoulos and Ion Stoica, founded Chatbot Arena, which later became LMArena.
Their idea was to create an open, community-centered, web-based platform and invite everyone to participate in the evaluation.
Chatbot Arena quickly attracted attention: thousands of people came to vote, and the team used that voting data to compile the first version of the leaderboard.
Most of the models on the initial leaderboard were open source; the only commercial entries were Claude and GPT.
As more models joined, Chatbot Arena drew ever more attention, and AI giants asked to have their products ranked, each vying for the top of the leaderboard.
Chatbot Arena's popularity also led many technology companies to treat it as a barometer of AI progress, watching changes in the leaderboard as closely as Wall Street traders watch the stock market.
All this surprised Joseph Spisak, director of product management at Meta AI, who was amazed that a few students could have such an outsized impact.
Chiang hopes LMArena can become a platform accessible to everyone, where more users test the models, voice their opinions and preferences, and help the community and model providers better evaluate AI against real-world use cases.
As Chiang put it, the most popular and fastest-growing models in the LMArena community often emerge from real-world use, and "Nano Banana" is one of the most successful examples.
Its anonymous debut and the blind-test mechanism made nano-banana a natural hit on LMArena. Ordinary users couldn't select nano-banana manually; they could only encounter it at random in battles, and the community filled with posts about "battling again and again while waiting for Banana".
Gemini 2.5 Flash Image is now the "double champion" on LMArena, ranking first on both the Image Edit Arena and Text-to-Image leaderboards.
The LMArena rankings also reveal the best-performing models in each field.
For example, Claude ranks best in coding, while Gemini sits among the top models for creative tasks.
Perhaps because of internal reshuffling on Meta's AI team, Chiang hasn't heard much about Llama 4, but he believes the "full model" Meta is building may represent a major trend for the industry.
Why do large-model makers love to "top the leaderboard"?
Why are large-model makers like OpenAI, Google, and Anthropic so keen to put their models on leaderboards like LMArena?
Is it to build brand exposure or to get user feedback to improve their models?
Exposure and endorsement are clearly the most immediate short-term payoffs.
LMArena is one of the industry's most-watched public leaderboards, with millions of cumulative votes, and tech media frequently cite its data, which brings significant word-of-mouth and traffic dividends to large-model brands.
Second comes user feedback that is closer to real-world use.
LMArena combines anonymous, randomly paired voting with Elo-style scoring, which reduces subjective influences such as brand halo and position bias and more faithfully reflects how users judge the quality of model responses.
The Elo system, originally devised to rate chess players, is the core mechanism behind the LMArena leaderboard. Under its rules, each player (or model) carries a strength rating (an Elo score), and after each battle both sides' scores are updated according to the actual result versus the expected one.
Each user vote thus becomes a battle. After thousands of battles, the models' Elo scores converge, and the ranking more faithfully reflects user preferences.
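A minimal Elo update looks like the sketch below. The K-factor of 32 and the 1000-point starting rating are illustrative defaults, not LMArena's actual parameters:

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected probability that A beats B: E_A = 1 / (1 + 10**((R_B - R_A) / 400))."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0) -> tuple[float, float]:
    """Update both ratings after one battle; score_a is 1 for an A win, 0.5 for a tie, 0 for a loss."""
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (score_a - e_a)              # actual minus expected, scaled by K
    new_b = r_b + k * ((1 - score_a) - (1 - e_a))  # B's update mirrors A's
    return new_a, new_b

# Example: two models start at 1000; model A wins one vote.
ra, rb = elo_update(1000.0, 1000.0, score_a=1.0)
print(round(ra), round(rb))  # 1016 984: the winner gains exactly what the loser gives up
```

Because the expected score depends on the rating gap, an upset win against a higher-rated model shifts both scores far more than a win over an equal or weaker one, which is what lets the rankings converge over thousands of votes.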
In addition, LMArena offers a stage for competition across vendors and across open- and closed-source models, which naturally brings more exposure and gives users more diverse information for choosing among models.
As Chiang said, he hopes to build LMArena into an open space where everyone can participate and express their opinions.
Everything here is driven by community mechanisms that encourage everyone to ask questions, vote, and share their assessments of different models.
For large-model makers, LMArena also offers a good chance to "look in the mirror".
They can see where they rank in each field, obtain the reports and analyses LMArena produces from community feedback, and comprehensively assess their models' performance in order to improve capabilities in a targeted way.
Do we need new LLM benchmarks?
When all models score nearly the same on existing benchmarks, do we still need new ones?
Chiang believes we do. A core principle is that these benchmarks should be rooted in real-world use cases.
For example, the field can move beyond traditional benchmarks toward ones closer to real user scenarios, especially benchmarks driven by professionals who are skilled at using AI tools to complete tasks.
Take LMArena's newly launched WebDev benchmark: users prompt a model to build a website. Benchmarks like this connect AI more directly to real-world use cases and help the technology land in practical application scenarios faster.
Asked about the MIT report finding that most companies investing in AI haven't seen a return on their investment, Chiang called it an interesting study.
In his view, it underscores how important it is to tie AI closely to real-world use cases, which is also why he wants to expand the LMArena platform into more industries.
The hope is that more benchmarks rooted in real-world use cases can bridge the gap between the technology and practical scenarios while providing measurable standards.
Chiang said LMArena's goal is to use platform data to understand models' limitations, keep the data-research process transparent, and publish the data to support the community platform's continued growth.
For large-model makers and "user audiences" alike, this is an arena that never closes.
Reference materials:
https://www.businessinsider.com/lmarena-cto-compare-ai-models-google-nano-banana-2025-9
This article is from the WeChat official account "New Intelligence Yuan". Author: New Intelligence Yuan. Editor: Yuan Yu. Republished by 36Kr with authorization.