Domestic AI Programming Ranks Second Globally: Which Is the Ultimate Vibe Coding Tool After Testing Five Major Models?

Which domestic model is the best? Look to... in Zhejiang, China.

Surpassing GPT-5.5, Gemini 3.5 Flash, and DeepSeek V4 Pro, Alibaba's latest flagship model, Qwen3.7 Max, has secured the second place on the programming competition leaderboard, only trailing behind Claude Opus 4.7.

Screenshot of the May 26th leaderboard

In addition to user choices in real-world scenarios, on traditional fixed evaluation leaderboards for large models, such as the Terminal Bench for terminal capabilities and the SWE Bench for programming capabilities, Qwen3.7 Max has also won the championship among domestic models.

Although it's been four years since the emergence of large models and we're quite used to the frequent updates of these leaderboards, we still can't help but want to experience how the Qwen model, which can outperform GPT 5.5, actually performs.

You know, the currently most popular Coding Agent combination is probably Codex paired with GPT 5.5.

If we change the default model in Codex to Qwen3.7 Max and then use Codex to complete some daily tasks, will it be even better than GPT 5.5?

Obtain Qwen3.7 Max

Taking advantage of the token discount promotions launched by various companies, Alibaba Cloud is also offering 1 million free tokens for use on the Alibaba Cloud Bailian platform.

The pricing of Qwen3.7 Max on the Alibaba Cloud official website is currently at a 50% discount for a limited time. The input costs 6 yuan per million tokens, and the output costs 18 yuan per million tokens. New users can also participate in a 50% discount recharge savings plan, getting 20 yuan worth of tokens for 10 yuan per month. The standard Token Plan currently costs 198 yuan per month.

Overall, according to the data displayed on the large model aggregation platform OpenRouter, the price of Qwen3.7 Max is in the middle range. It can't compare with DeepSeek's rock-bottom prices, but it's still much more affordable than Opus 4.7 and GPT 5.5.

We directly recharged 20 yuan for the "Preferred Entry-Level" package, which is applicable to all models. However, it should be noted here that the 50% discount only applies to one package. That is, if you purchase the 10-yuan package, you can't buy the 50-yuan or 250-yuan half-price discount plans.

Test DeepSeek, Claude, GPT, Gemini, and Qwen Together

After obtaining the API Key and one million free tokens, we first used Qwen3.7 Max on the Alibaba Cloud Bailian platform and the Qianwen official website to do some common front-end web design to test its development capabilities.

For a physical simulation test where the differences can be intuitively seen, we used a simple prompt: "Use HTML + CSS + JS to create an animation simulating liquid sloshing in a container. Dragging the container can change the tilt angle."

Generated by Qwen3.7-Max on the Qianwen official website

Qwen3.7 Max successfully completed this simulation challenge and also added functions such as color customization, shaking, and liquid volume adjustment.

DeepSeek's performance was relatively simple but error-free.

Generated by DeepSeek V4 on the official website

The liquid generated by GPT-5.5 was a bit strange. Although it flowed in the corresponding direction as the angle changed, the overall wave was quite out of place.

Generated by GPT-5.5 Ultra on Codex

There seemed to be a bug in the web page generated by Gemini 3.5 Flash. The bottle was always hidden behind the control panel and had to be dragged out manually. However, with the same prompt, it provided a lot of customizable options, including the type of the bottle, the color of the liquid, and various settings could be customized.

Generated by Gemini 3.5 Flash on the official website, selecting the Canvas option

The bottle generated by Claude Opus 4.7 was too simple, and the simulated liquid sloshing effect in a violent state was more like the beating of sound waves.

Generated by Claude Opus 4.7 using the Claude Code app

Then we tried to let it generate a small game. Although game testing has been a common test item in last year's Vibe Coding. This time, we asked the AI to create a hexagonal 2048 game, with the prompt "Create a playable 2048 game, but with hexagonal grids."

The page generated by Qwen3.7 Max was quite nice. It could be seen that most of the 10 reference sources were from CSDN's 2048 game generation tutorials.

The final game was playable, but there were still occasional moments when it didn't follow the rules. For example, when adding the same numbers in the same direction, they weren't added in the right position.

Generated by Qwen3.7 Max on the official website

DeepSeek V4's performance was similar to the previous round. However, although it was a hexagonal grid, the keyboard controls it provided were only WASD for sliding.

Generated by DeepSeek V4 on the official website

Claude's Opus 4.7 probably performed the best in this round. It really understood how to set up the game, and the movement of the grids followed the rules of the honeycomb, which didn't make people feel confused.

Generated by Claude Opus 4.7 using the Claude Code app

Relying on Codex's capabilities, GPT 5.5 could open the browser to preview whether there were any problems after generating the game and capture the console information to fix the project code. The final generated web page was also excellent. However, in terms of monitoring the moving direction of the mouse on the screen, it didn't perform as well as Opus 4.7.

Generated by GPT-5.5 Ultra on Codex

Gemini 3.5 Flash, as always, added a lot of things. It wrote three background themes for the game: cyber, dark gold, and macaron. It even added a "built-in high-quality synthesizer."

The gameplay is accompanied by retro 8-bit space sound effects (merging, sliding, passing levels, and dying) generated by the native Web Audio, instantly enhancing the experience.

Generated by Gemini 3.5 Flash on the official website, selecting the Canvas option

Back to the design of some ordinary web pages, we asked it to design a website for a subway museum, with the prompt being just one sentence: "Design a theme website called the Subway Museum, requiring a strong sense of immersion."

Originally, we hoped that these large models could list as much subway information from different cities as possible, the logos of world subways, and the overall style of the website should be artistic, with a specific style and sufficient special effects for presentation.

To be honest, it's hard to evaluate Qwen3.7 Max. Arranging the text vertically is a bit like a subway train, but the whole website gives a messy feeling.

Generated by Qwen3.7-Max on the Qianwen official website

Gemini continued to add a lot. Sound effects were used again. Interestingly, it also created a subway cultural and creative product, a customized commemorative ticket generator. We can enter our names, select stations, and instantly generate a high - quality, retro-style subway commemorative ticket.

该文观点仅代表作者本人，36氪平台仅提供信息存储空间服务。

Domestic AI programming ranks second globally. After testing five major models, which one is the ultimate Vibe Coding tool?

Obtain Qwen3.7 Max

Test DeepSeek, Claude, GPT, Gemini, and Qwen Together