
Another domestic flagship model has been open-sourced. Overseas netizens: The "Big Four" of China's open-source AI have taken shape.

Zhidongxi, 2025-07-31 07:48
The ecosystem of domestic open-source models is experiencing unprecedented prosperity.

In recent weeks, domestic open-source models have seen a concentrated burst of releases. Internet giants and AI unicorns have successively unveiled their open-source trump cards, vying for the top of the global open-source model leaderboards. Just this week, another domestic open-source model became a sensation across the internet.

This model comes from Zhipu, known as the "Chinese OpenAI," and it is their latest-generation flagship model, GLM-4.5. The timing is notable: it lands right before the rumored release of OpenAI's GPT-5, and it likewise focuses on reasoning, programming, and agent capabilities.

By going open-source, Zhipu has seized the opportunity and gained a wave of attention at home and abroad in advance. The official announcement tweet has received over 770,000 views and was retweeted by the CEO of the model-hosting platform Hugging Face.

Within 48 hours of release, GLM-4.5 topped the Hugging Face trending list, becoming one of the most-watched open-source models globally; GLM-4.5-Air ranks sixth. Bill Gurley, a partner at the Silicon Valley venture firm Benchmark, posted that Chinese open-source AI models generate a powerful combined effect: the models can improve on one another, making it easier to launch new ones.

Notably, around the World Artificial Intelligence Conference (WAIC), Chinese open-source large models have successively broken into the mainstream. Kimi K2 from Moonshot AI and multiple models from Alibaba all performed excellently, followed by Zhipu's GLM models. As of today, almost all of the top 10 models on the Hugging Face open-source model list are Chinese large models. CNBC notes that the AI models being developed by Chinese companies are not only getting smarter but also steadily cheaper to use.

Moreover, an overseas AI blogger made a meme image describing the evolving AI competitive landscape: global large AI models have split into an open-source camp represented by Chinese models and a closed-source camp represented by American models. Recently, following DeepSeek and Qwen, domestic models such as Kimi and GLM have also gone open-source in a big way, adding powerful players to the Chinese open-source roster. This seems to have formed the "Four Open-Source Heroes of Chinese AI," competing with the international "Four Closed-Source Powerhouses" of GPT, Claude, Gemini, and Grok.

GLM-4.5 is positioned as an agentic base model that unifies reasoning, coding, and agent capabilities. Across 12 benchmarks covering reasoning, programming, and agent scenarios, GLM-4.5's overall performance is SOTA (i.e., first) among global open-source models, first among domestic models, and third among all models globally.

Beyond the rankings, Zhipu also tested the model's agentic programming ability in real scenarios, comparing it in parallel against models such as Claude-4-Sonnet, Kimi-K2, and Qwen3-Coder. To keep the evaluation transparent, Zhipu published all 52 questions and agent trajectories involved in these tests, so the industry can verify and reproduce the results, a move netizens have praised.

Meanwhile, Zhipu offers highly cost-effective API pricing for the model: input costs as little as 0.8 yuan per million tokens and output 2 yuan per million tokens, and the high-speed version reaches up to 100 tokens per second. In addition, users can use the full GLM-4.5 for free on Zhipu Qingyan and z.ai.
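At these rates, per-call cost is easy to estimate. A minimal sketch using only the prices quoted above (actual billing may involve tiers or discounts not covered here):

```python
# Estimate GLM-4.5 API cost from the published per-million-token prices.
INPUT_PRICE_YUAN_PER_M = 0.8   # yuan per million input tokens
OUTPUT_PRICE_YUAN_PER_M = 2.0  # yuan per million output tokens

def api_cost_yuan(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in yuan for one API call."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_YUAN_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_YUAN_PER_M

# Example: 1M input tokens plus 500K output tokens cost 0.8 + 1.0 = 1.8 yuan.
print(api_cost_yuan(1_000_000, 500_000))
```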

Recently, Zhidongxi conducted in-depth hands-on tests of GLM-4.5's capabilities. The model's effectiveness in real production scenarios is quite surprising.

Experience links:

https://chatglm.cn

https://chat.z.ai/

Model repository:

https://huggingface.co/collections/zai-org/glm-45-687c621d34bda8c9e4bf503b

01. First-hand Test of GLM-4.5: Create a Complete Database with One Sentence, and the Thinking Process is Concise and Clear

Currently, many domestic and foreign netizens have begun trying out GLM-4.5, using it to build AI personal fitness coaches, web games, 3D animations, and more. Its programming ability and its capacity for long-horizon, complex tasks have left a deep impression.

This stems from the agent capabilities GLM-4.5 emphasizes this time. Compared with traditional static tasks such as Q&A, summarization, and translation, agent tasks place stricter, more comprehensive demands on a model. They exercise a large model's key faculties of perception, memory, planning, and execution, and lay the groundwork for further multi-dimensional capabilities.

Agents often face open environments and require continuous perception, long-term planning, and self-correction. Agent tasks are also composite: beyond language processing, the model must coordinate tool use, execute code, operate interfaces, and even carry out multi-round interactive cooperation, truly testing its overall orchestration ability. In this sense, agent tasks are not just another task format but a kind of "stress test."

Full-stack development is a typical agent task. To test it, Zhidongxi gave GLM-4.5 a fairly complete development task: use PHP + MySQL to build a bilingual (Chinese and English) terminology library with create, read, update, and delete functions. One difficulty is that the model must plan the project structure on its own, clarify functional requirements, and design the database schema, thinking through and solving the problem like a real engineer.

Zhidongxi has given similar prompts to other models, but many cannot plan a project structure sensibly and even cram every function into a single web page file. The delivered results therefore cannot be deployed in production, let alone be modified or extended.
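To make the task concrete, here is an illustrative sketch of the kind of schema and CRUD operations it implies. The actual task used PHP + MySQL; this uses Python's built-in sqlite3 purely for illustration, and the table and column names are hypothetical:

```python
import sqlite3

# Illustrative schema for a bilingual (Chinese/English) terminology library.
# Table and column names are hypothetical; the real task targeted PHP + MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE terms (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        term_zh TEXT NOT NULL,        -- Chinese term
        term_en TEXT NOT NULL,        -- English term
        definition TEXT,              -- optional explanation
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

# The four operations the task requires (create, read, update, delete):
conn.execute("INSERT INTO terms (term_zh, term_en) VALUES (?, ?)",
             ("大模型", "large model"))                          # create
rows = conn.execute("SELECT term_en FROM terms WHERE term_zh = ?",
                    ("大模型",)).fetchall()                       # read
conn.execute("UPDATE terms SET term_en = ? WHERE term_zh = ?",
             ("large language model", "大模型"))                  # update
conn.execute("DELETE FROM terms WHERE term_zh = ?", ("大模型",))  # delete
```

Planning a separate schema, data layer, and page structure like this, rather than piling everything into one file, is exactly what distinguishes a deployable result from a throwaway demo.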

Surprisingly, GLM-4.5's delivered result is relatively complete, implements the specified functions, and arrived quickly: it finished the three core pages in about 2 minutes. The deployed result looks like this:

This result likely benefits from GLM-4.5's clear thinking process before it starts generating code: it accurately judged the nature of the project and worked out which files to generate, giving clear guidance for subsequent development. The thinking process itself is direct, concise, and clear.

Part of the conversation record: https://chat.z.ai/s/50e0d240-2034-407b-a1b3-94248dd5f449

Zhipu's official demo shows more abilities of GLM-4.5. For example, it can accurately reproduce the UI interfaces of websites such as YouTube, Google, and Bilibili according to user needs, which can be used for demo display and other requirements.

Conversation record: https://chat.z.ai/s/01079de2-a76d-41ee-b6ee-262ea36c4df7

Or it can create a web page that allows users to design mazes independently and the system to find paths.

Conversation record: https://chat.z.ai/s/94bd1761-d1a8-41c9-a2f4-5dacd0af88e9

This full-stack ability is useful not only in real production scenarios but also for fun. Zhipu's team built a "quantum merit box" that supports real interaction and saves data to the backend.

However, GLM-4.5's development process on these projects may deserve deeper discussion. The agent's execution trajectory shows that, once combined with development tools, GLM-4.5 completes tasks more end-to-end: it first creates a to-do list, then works through the tasks step by step, summarizes development progress, and runs comprehensive verification and debugging when the user suggests changes.

Conversation record: https://chat.z.ai/s/1914383a-52ac-48b7-9e92-fa105be60f3e

GLM-4.5 also performs well in slide (PPT) production. It can create a complete, polished deck to the page count and content a user specifies, and enrich the visuals by invoking search tools. In the figure below, for example, GLM-4.5 built a career-retrospective deck for the legendary sprinter Usain Bolt.

Conversation record: https://chat.z.ai/s/544d9ac2-e373-4abc-819b-41fa6f293263

The cases above give an intuitive sense of GLM-4.5's capabilities. So what technical innovations does the model rely on to achieve this performance? Zhipu answered in the technical blog released alongside it.

02. Breakthrough in Parameter Efficiency, Compatible with Multiple Programming Agents

The training of GLM-4.5 proceeds in three stages overall. From underlying architecture and task selection to optimization strategy, each stage incrementally improves the model's capabilities.

First, in the pre-training stage, the GLM-4.5 series borrows the MoE (Mixture-of-Experts) architecture of DeepSeek-V3 but keeps Grouped-Query Attention combined with Partial RoPE in the attention mechanism.

This mechanism has been in use since ChatGLM2 and avoids the difficulties that Multi-Head Latent Attention (MLA) poses for tensor parallelism. Zhipu also configured more attention heads, because the team found that increasing the head count significantly improves performance on reasoning benchmarks.
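The idea behind Grouped-Query Attention is that many query heads share a smaller set of key/value heads, shrinking the KV cache. A minimal numpy sketch (head counts and dimensions are illustrative, not GLM-4.5's actual configuration):

```python
import numpy as np

# Toy grouped-query attention: n_q query heads share n_kv key/value heads
# (n_q must be a multiple of n_kv).
def grouped_query_attention(q, k, v, n_kv):
    n_q, seq, d = q.shape
    group = n_q // n_kv                       # query heads per KV head
    k = np.repeat(k, group, axis=0)           # broadcast each KV head to its group
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ v                        # shape (n_q, seq, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))   # 8 query heads
k = rng.standard_normal((2, 4, 16))   # only 2 KV heads -> 4x smaller KV cache
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v, n_kv=2)
print(out.shape)  # (8, 4, 16)
```

Because only the K/V tensors must be cached during generation, cutting KV heads from 8 to 2 here would cut the cache to a quarter, which is the main practical appeal of GQA.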

Both GLM-4.5 and GLM-4.5-Air include an MTP (Multi-Token Prediction) layer, allowing the model to predict several subsequent tokens in a single forward pass. Tests show this mechanism significantly accelerates inference.
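Conceptually, MTP adds an extra prediction path beside the usual next-token head, so one forward pass drafts more than one future token; the drafts can then be verified, as in speculative decoding. A highly simplified numpy sketch (in practice the MTP module is a small extra transformer layer, and all weights here are random placeholders):

```python
import numpy as np

# Toy multi-token prediction: besides the standard next-token head, an extra
# head predicts the token after next, so a single forward pass yields two
# draft tokens. Shapes and weights are illustrative, not GLM-4.5's.
rng = np.random.default_rng(1)
d_model, vocab = 32, 100
hidden = rng.standard_normal(d_model)              # final hidden state

head_next = rng.standard_normal((d_model, vocab))  # standard LM head
head_next2 = rng.standard_normal((d_model, vocab)) # extra MTP head

logits_next = hidden @ head_next
logits_next2 = hidden @ head_next2

# Two tokens drafted from one forward pass; a verification step would
# then accept or reject the second draft.
draft = [int(np.argmax(logits_next)), int(np.argmax(logits_next2))]
print(len(draft))
```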