
Xiaoyunque (Lark) vs WonderClip (Wanjing Yike): The First Round of the Content Agent Battle Between ByteDance and Alibaba

壹娱观察 · 2026-06-12 21:36
The market may not be short of tools, or even content.

The "battle of a hundred models" among domestic large models has yet to see a clear winner, but the competition has already shifted to the agent level.

At the Alibaba Cloud Summit on May 21, Alibaba released WonderClip, an end-to-end AI video creation platform, and opened it to the entire industry the same day. ByteDance, however, had moved earlier. Xiaoyunque Agent, incubated by the Jianying team, was already running on both web and app. After Seedance 2.0 was released in February this year, its user numbers rose. In mid-March it launched the industry's first short drama and comic drama Agent built on Seedance 2.0, which turns an uploaded script directly into video clips. Facing competition, it recently iterated to version 2.0.

Screenshot of Xiaoyunque's official website

Of course, because they plug into different ecosystems with different strengths, Xiaoyunque leans toward content traffic and WonderClip toward e-commerce marketing. As content creation agents, however, both promise to take an idea to a finished video in one pass, and both target the same business: short dramas, marketing, and online novel adaptations. It is reasonable to see them as another head-to-head confrontation between ByteDance and Alibaba.

However, at present, the market may not lack tools or even content. After a large amount of content is produced, the real problem becomes who will consume it, and this is exactly the part that AI cannot solve.

01. Confrontation with Different Approaches but the Same Goal

Xiaoyunque, backed by a complete content ecosystem, clearly has a first-mover advantage. According to data released by QbitAI in March, the DAU of its App exceeded 800,000, a month-on-month increase of 54%, and the download volume increased by 122% month-on-month.

Functionally, entering a theme produces a 15- to 60-second video clip. Users can choose a digital human avatar cleared for commercial use and add a script to generate a voice-over. Paste a Douyin or Xiaohongshu link, and it analyzes the shot logic, pacing, visual style, and BGM, then outputs a similar video. Logging in each day earns 120 points, so the product is basically free at present.

Its model capability is essentially that of Seedance 2.0.

Screenshot of the video generated by Seedance 2.0

Seedance 2.0 supports the mixed input of up to 12 materials. Pictures are used to define characters, old videos are used to define camera movements, and audio can drive the rhythm of the picture. Coupled with physical law modeling and native audio-video synchronization, these form the basis for Xiaoyunque to serve as a short drama Agent.

Of course, the model's shortcomings are passed on to the product as well. According to feedback from many users, it cannot offer fine control at the shot level, duration and resolution are capped, style is inconsistent across segments, complex narratives are beyond it, and queues at peak hours have run as long as eight hours.

More crucially, video generation in Doubao, Jimeng, Jianying, Suibian, and Xiaoyunque all runs on the same Seedance 2.0. The products differ only in the functions available, the generation speed, and the quota.

ByteDance does not have an independent technical system called "Xiaoyunque". It is one of the many outlets of the same model, and its task is to target the vertical scenarios of short dramas and comic dramas.

Screenshot of Xiaoyunque's official website

The free strategy brings in content supply, and the generated videos flow back to ByteDance's distribution system. Whether Xiaoyunque makes money is not a question that needs to be answered in this structure.

WonderClip, built by Alibaba Cloud based on Happy Horse, has an obviously different approach. It focuses on the process.

The platform is divided into five modules: intelligent script analysis, storyboard generation, main body creation, online editing, and asset management.

Once a script is uploaded, the platform automatically analyzes its structure, generates a storyboard script and camera movement instructions, and assigns shots, scenes, and characters. An intelligent analysis mode for breaking down long novels will launch soon. The workflow comes in three forms: storyboard, infinite canvas, and a dialogue-based video generation mode in which multiple Agents, such as a screenwriter, a director, and a prompt engineer, work together. The last is still in limited gray-release testing.

Screenshot of WonderClip's official website

The underlying technology is a set of Alibaba's own models. Wanxiang Wan2.7 serves as the foundation, Happy Horse handles video generation, and Qwen-image and Z-image handle image generation and rendering. On model ability alone, this combination may not have an advantage over Seedance 2.0. Happy Horse came out ahead in earlier benchmark tests, but its lead is small on the track that includes audio, and its handling of joint multi-material input has not yet caught up.

Its real differentiator may lie in the so-called "main body creation". Characters, scenes, and props can each be built into multi-angle reference images and saved as the project's visual archive. Every shot can then draw on the same set of materials, so faces stay the same across shots and the style does not drift. Materials generated once thus become assets that can be reused repeatedly.

This design is very useful to short drama teams, since reusing materials and assets saves substantial cost. Its primary target, however, may be e-commerce teams, which the current customer list seems to confirm.

Screenshot of the video generated by Happy Horse

The announced customers include Xiaowu Brothers, which is engaged in short drama overseas expansion, A.O. Smith, which creates digital avatars for salespeople, and Titanium Mobile Technology, which focuses on large-scale content production.

As for commercialization, WonderClip offers a brand-specific space with an independent domain name and UI, full-stack API integration, a multi-studio mode, and permission isolation. For enterprises that already have an Agent architecture, it opens up its audio and video creation Skills and API and supports "integration" into the customer's own workflow. Delivery runs through three channels (tool platform, API, and brand suite), plus a layer of authorized service providers.

By comparing the two, it becomes clear that this confrontation is not simply a competition of product or model ability.

Xiaoyunque is more like another entrance to ByteDance's Seedance, supplying content to Douyin through a free strategy. WonderClip is an extension of Alibaba Cloud, betting that enterprises are willing to pay for process and asset management. One serves ByteDance's content ecosystem, and the other serves Alibaba's cloud business. Both products mainly serve to win token usage for their respective groups.

02. Except for ByteDance, Film and TV Agents are a Thankless Task

The real question worth discussing lies beyond these two products.

Tencent has Shangtouwa, ByteDance has "Xiaozhangyu" of Jimeng in addition to Xiaoyunque, and Kuaishou has Keling. Almost every company with a video model has an agent, and their functions are highly overlapping: one-sentence video generation, storyboard, digital human, and end-to-end process. Last year was the "battle of a hundred models", and this year, the competition at this level is no less intense.

All these agents can only change the supply side.

The key is that the demand-side figures hit a ceiling long ago. The user base of the domestic mobile Internet and the average usage time per person have barely grown in the past two years. The short drama industry has 718 million monthly active users, which is very close to the upper limit set by the total number of Internet users.

Source: QuestMobile

The total amount of user attention is fixed. Time spent watching one video cannot be spent watching another. Even the short drama industry, which had been growing rapidly, has clearly stagnated in the past six months.

The supply side is expanding.

According to DataEye, in January this year, 14,634 AI comic dramas were launched in a single month, with an average of more than 470 per day. In March, the daily new comic dramas on just the Hongguo platform reached 2,000, which is 20 times that of live-action dramas. This is the production capacity before the large-scale implementation of agents.
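The figures above can be sanity-checked with simple arithmetic. The sketch below assumes a 31-day January, and the live-action count it derives is only what the quoted "20 times" multiple implies if taken at face value; it is not a number reported in the source.

```python
# Figures quoted from the article (attributed to DataEye).
JANUARY_LAUNCHES = 14_634      # AI comic dramas launched in January
DAYS_IN_JANUARY = 31           # assumption: calendar length of January

HONGGUO_DAILY_AI = 2_000       # daily new comic dramas on Hongguo in March
RATIO_TO_LIVE_ACTION = 20      # "20 times that of live-action dramas"

# Average launches per day across the month.
avg_per_day = JANUARY_LAUNCHES / DAYS_IN_JANUARY

# Implied daily live-action launches (derived, not reported in the source).
implied_live_action_daily = HONGGUO_DAILY_AI / RATIO_TO_LIVE_ACTION

print(f"Average AI comic dramas per day in January: {avg_per_day:.1f}")
print(f"Implied live-action dramas per day on Hongguo: {implied_live_action_daily:.0f}")
```

The January average is consistent with the "more than 470 per day" stated in the text.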

Recently, however, there has been a lot of news about bans on short drama themes. The theme dividend for short dramas is clearly almost exhausted.

In a market where demand has stopped growing, improving efficiency cannot bring incremental growth. The cost of a single video has indeed fallen, but it is falling for everyone at the same time. Competition therefore ends only in lower prices and higher output, while the attention and return earned by each piece of content shrink together. The money saved through cost reduction is also hard to keep: most of it becomes advertising budget and flows back to the platform.
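The reasoning in this paragraph can be illustrated with a toy model. This is a sketch under stated assumptions, not the article's own calculation: every number is hypothetical, the revenue pool is held fixed to stand in for fixed attention, and competitors are assumed to raise output by exactly the factor by which unit cost falls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Market:
    total_revenue: float   # fixed pool of attention, expressed as money (hypothetical)
    titles: int            # number of titles competing for that pool (hypothetical)
    cost_per_title: float  # production cost of one title (hypothetical)

    @property
    def revenue_per_title(self) -> float:
        # A fixed pool split evenly across all titles.
        return self.total_revenue / self.titles

    @property
    def industry_profit(self) -> float:
        # What producers as a whole keep after production costs.
        return self.total_revenue - self.titles * self.cost_per_title


def after_efficiency_gain(market: Market, factor: float) -> Market:
    """Unit cost falls by `factor` and output rises by the same factor (assumption)."""
    return Market(
        total_revenue=market.total_revenue,  # demand does not grow
        titles=int(market.titles * factor),
        cost_per_title=market.cost_per_title / factor,
    )


before = Market(total_revenue=1_000_000.0, titles=1_000, cost_per_title=800.0)
after = after_efficiency_gain(before, factor=10)

print(before.revenue_per_title, before.industry_profit)
print(after.revenue_per_title, after.industry_profit)
```

Under these assumptions, revenue per title falls tenfold while total industry profit does not move, which is the pattern the text describes: more output, thinner returns per piece, and no incremental growth.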


The original meaning of the word "involution" describes this state: continuous increase in input but no growth in total output.

This round of agent competition has therefore lacked an outlet from the very beginning. It consumes real computing power and produces a large number of videos that no one finishes watching, so the added value is close to zero.

The real beneficiaries are the model layer and the distribution layer.

Xiaoyunque does not make money on its own. Its task is to supplement the content supply for Douyin. WonderClip charges fees, but the income essentially goes to Alibaba Cloud's computing power and models. The agent layer itself retains very little. It is less a product than an entrance to each large company's ecosystem.

Followed further, this structure leads to an even more unpleasant conclusion: in the current ecosystem, building film and TV content agents is a thankless task for every company except ByteDance, and simply a waste of computing power.

ByteDance is the only player with a closed-loop ecosystem.

Some apps in ByteDance's closed-loop ecosystem

The model is its own Seedance; the tools are Xiaoyunque, Jimeng, and Jianying; the distribution platforms are Douyin, Hongguo, and even TikTok; and monetization comes from advertising and the traffic business built on free reading. Content never leaves its own system from production to consumption. The computing power burned by the agents at least buys content supply, so the math can still work out.

Other companies do not have this "closed-loop" condition.

Alibaba does not have its own content consumption platform. Most of the short dramas that WonderClip mass-produces for customers still have to be promoted on Hongguo and Douyin in the end. This amounts to burning Alibaba Cloud's computing power to expand ByteDance's content supply, while earning only a little from tools and APIs and getting no share of the far larger distribution revenue.

Kuaishou has a distribution platform, but 70% of Keling's income comes from overseas individual subscriptions, and a closed loop from production to distribution in China has not really been established. Tencent's Shangtouwa faces the same problem...

With computing power already in short supply, these investments look more like fear of missing out than strategic planning.


This does not mean that agents have no value. They do lower the threshold for creation, and WonderClip's idea of turning materials into assets does address real pain points in the industrialization of short dramas.

However, having value and being worth building at every large company are two different things. That judgment applied to the earlier large video models, and it still applies to agents today.

ByteDance and Alibaba naturally care about the outcome of the competition between Xiaoyunque and WonderClip. However, in a market where demand no longer grows, no matter how intense this battle is, it is still a waste of resources and will not generate any