HappyHorse has no surprises.
Author | Ranger
Editor | Joe Qian
On April 27th, the long-awaited HappyHorse finally opened for testing. Unfortunately, it failed to make as big a splash as the recently launched Seedance 2.0. "No surprises" is a fairly accurate verdict on HappyHorse.
HappyHorse is a video model developed by the Innovation Division under Alibaba's ATH Business Group. It began a gray release (a limited rollout) on April 27th and was integrated into the Qianwen App.
There are two reasons why this video model has received so much attention.
First, before the open testing and without disclosing its developer, HappyHorse topped the AI Video Arena run by Artificial Analysis, an authoritative AI evaluation platform that relies mainly on blind tests. With a higher Elo score (a rating that is updated dynamically based on wins and losses and on the strength of the opponents), it outperformed video models such as ByteDance's Seedance 2.0, Kuaishou's Keling AI, and Google Veo 3 Fast, and became an instant hit.
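To make the Elo idea concrete, here is a minimal Python sketch of the textbook Elo update. It is illustrative only: this article does not describe how the arena actually computes its ratings, and the K-factor of 32 and the sample ratings below are assumptions, not figures from the leaderboard.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return (new_a, new_b) after one blind comparison.

    score_a is 1.0 if A wins, 0.5 for a tie, 0.0 if A loses.
    k (the K-factor) is an assumed value controlling how fast ratings move.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + delta, rating_b - delta


# Hypothetical ratings: a win over an equal opponent moves each side by 16 points.
print(update_elo(1200, 1200, 1.0))
# A win over a stronger opponent is worth more than a win over an equal one.
print(update_elo(1200, 1400, 1.0))
```

The second call shows why the strength of the opponent matters: beating a higher-rated model yields a larger gain than beating an equally rated one.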
A steady stream of speculation about its origin and capabilities followed. Several fake official websites even sprang up to impersonate it, drawing in many unsuspecting onlookers.
Second, this video model is backed by Alibaba. On April 10th, three days after it topped the rankings, Alibaba's ATH Innovation Division officially claimed it.
Both HappyHorse and the ATH Business Group it belongs to are quite young. The latter was established in March by Alibaba's CEO, Wu Yongming, who leads it personally. It brings together five units: Tongyi Laboratory, MaaS Business Line, Qianwen Division, Wukong Division, and AI Innovation Division. According to the official account, the ATH Innovation Division has launched a plan to explore new modes of interaction for the AI era, and HappyHorse is part of that exploration. More products are to follow.
According to people close to Alibaba, after the ATH Business Group was established in March this year, Alibaba set a goal of reaching a daily consumption of one million for its AI business. To make up for its shortcomings in multi-modal large models and to drive the consumption of Tokens, Alibaba accelerated the rollout of its large video generation model, and HappyHorse is the product of this strategy.
With Seedance 2.0 dominating the market at a relatively high price and with a long waiting list, the industry has been hoping for a new video model with comparable capabilities. However, after the testing opened, many industry practitioners expressed disappointment. During the Spring Festival of the Year of the Horse, the icon of the Dream App changed from a spinning top to a small horse, which the media now reads, rather pointedly, as "grasping the little horse."
Image source: Screenshot of the official page
Without a technological leap, there's only catching up
What is the technical ability of HappyHorse really like?
Glowave, a tool developed in-house by the content technology company Sansheng Qingying, has been integrated with HappyHorse. After extensive hands-on use, founder Jiang Yiqi judged the model to perform well but to be slightly inferior to Seedance 2.0.
Jiang Yiqi studied computer vision at Tsinghua University and once worked at Alibaba's DAMO Academy, so he knows large video models well. He told 36Kr that compared with Seedance 2.0, HappyHorse falls short in cinematic feel and in prompt fidelity. The former refers to output that comes closer to traditional professional film and television work, including the fineness of the picture and the richness of the background. The latter can be roughly understood as the ability to understand human language.
36Kr also tested three products: Seedance 2.0, Keling 3.0, and HappyHorse. Videos of the same duration were generated using the same prompts and resolution. After watching the two videos generated by Keling 3.0 and HappyHorse, Jiang Yiqi thought that the latter was slightly weaker in aesthetics but better in prompt fidelity and physical realism. "If I were to score these two videos, I would give Keling 3.0 8 points and HappyHorse 9 points."
He further explained, "After all, HappyHorse 1.0 is a first version, and this is already a very good starting point. The recent decline in the performance of Keling 3.0 may be because it is putting its computing power into developing a major upgrade."
The Seedance 2.0 test video could not be generated. As of publication, 36Kr was still facing a ten-hour wait in the queue.
Generally speaking, HappyHorse mostly catches up with capabilities that video models on the market already have and fails to make a qualitative breakthrough.
In fact, HappyHorse's raw specifications are not bad: with 15 billion parameters, almost three times that of Seedance, it supports multi-shot narrative within 15 seconds, multi-frame adaptation, and 1080P super-resolution output. In other words, HappyHorse can also generate, with one click, a 15-second multi-shot video with synchronized audio and video.
Had these capabilities been available three months ago, they might have made the film and television industry re-examine its production processes and organizational structures. Now, however, they are standard features of large video generation models on the market and overlap heavily with Seedance 2.0 and Keling 3.0.
As for why its performance fails to keep up despite a parameter count several times that of Seedance, Jiang Yiqi suggested that it may be related to data quality: HappyHorse lags behind ByteDance and Kuaishou in short-video data and in film-and-television-grade video data.
China's large video generation models are now locked in fierce, homogeneous competition. Being merely adequate is far from enough.
An employee of a leading video model manufacturer told 36Kr that their boss once said that the core standard for measuring the ability of a large model is "intelligence," which can be understood as whether the iterative update of the large model can change the production structure of an industry. For example, with the emergence of Seedance 2.0, storyboard artists are no longer needed.
Beyond a qualitative breakthrough, the pace of iteration also has to keep up. The industry now generally expects a new version of a large model every one to two months; otherwise, there is a risk of falling behind. In this context, a mediocre model cannot break through.
Not long ago, Keling 3.0 updated its functions and can directly output 4K videos, which is in line with the direction of the AI video industry's efforts to move towards the big screen. A new round of competition has begun, and HappyHorse has just entered the stage.
Being cheap can't be a moat
Is HappyHorse, which is not outstanding in terms of technology, competitive in terms of price and commercialization?
Zhao Yucheng is the business director of Deyun AIGC. His company resells the APIs of large models such as Seedance 2.0 and Keling 3.0, and it has now also partnered with HappyHorse.
Zhao Yucheng told 36Kr that HappyHorse is mainly targeting Seedance 2.0 and hopes to win over some of the large customers who have signed annual contracts with it. However, since it is still at an early stage, HappyHorse does not have high expectations for commercialization and has set no specific goals.
Before Seedance 2.0 opened its API, anyone who did not want to wait in the queue in the Dream App had to sign an annual contract worth 10 million yuan with Volcengine. A 20% deposit could be paid up front, and the remainder had to be spent within a year as Tokens consumed through calls to the model. A practitioner who signed an annual contract told 36Kr that the contract does not say what happens if the Tokens are not used up.
"This leaves room for some users to migrate from Seedance 2.0 to HappyHorse," Zhao Yucheng said.
Shortly after Alibaba claimed HappyHorse, so many people approached Zhao Yucheng with inquiries that he had to add a sentence to the company introduction: "HappyHorse still needs to wait." At that time, the people at Volcengine were also on edge and kept a close watch on HappyHorse's every move.
However, since the open testing began the day before yesterday, no customer has expressed a willingness to migrate from Seedance 2.0 to HappyHorse.
The price of Tokens is constantly rising. For some AI film and television companies, in addition to labor costs, the biggest expense is the purchase of computing power. Another large model service provider told 36Kr that many small and medium-sized customers only care about the price. Some companies even include the Token usage in their performance appraisal in an attempt to save computing power.
Therefore, a lower price can indeed bring a competitive advantage to the commercialization of a large model, but the price advantage cannot exist independently.
One day before the open testing, HappyHorse announced its pricing.
Specifically, generating video with this model costs 0.9 yuan per second at 720P and 1.6 yuan per second at 1080P. After the limited-time discount, professional members on a monthly plan pay 0.44 yuan and 0.78 yuan per second respectively. This pricing is similar to that of Keling but cheaper than that of the Dream App.
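As a rough illustration of what these per-second rates mean for a single clip, the following Python sketch computes the cost of a 15-second video, the maximum length mentioned earlier. The prices are the ones quoted above; the function and table names are hypothetical and do not correspond to any official billing API.

```python
# Per-second prices in yuan, as quoted in the article.
LIST_PRICE = {"720P": 0.9, "1080P": 1.6}
MEMBER_PRICE = {"720P": 0.44, "1080P": 0.78}  # limited-time professional-member rate


def clip_cost(resolution: str, seconds: float, member: bool = False) -> float:
    """Cost in yuan of one generated clip, rounded to two decimals."""
    table = MEMBER_PRICE if member else LIST_PRICE
    return round(table[resolution] * seconds, 2)


def member_discount(resolution: str) -> float:
    """Fraction saved by the member rate relative to the list price."""
    return 1 - MEMBER_PRICE[resolution] / LIST_PRICE[resolution]


for res in ("720P", "1080P"):
    print(res, clip_cost(res, 15), clip_cost(res, 15, member=True),
          f"{member_discount(res):.0%} off")
```

At both resolutions the member rate works out to roughly half the list price.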
Service providers can get tiered discounts off HappyHorse's list price: at a daily call volume of about 10 billion Tokens, worth nearly 100,000 yuan at list price, they get a 20% discount; at a daily call volume of 100 billion Tokens, they get a 30% to 40% discount.
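The tier structure can be sketched in Python as follows. This is an assumption-laden illustration, not HappyHorse's actual pricing logic: it reads "20% discount" as 20% off the list price, treats the approximate volumes as hard thresholds, assumes no discount below the first tier, and assumes the 20% tier applies to everything between the two thresholds. All names are hypothetical.

```python
TIER_1_TOKENS = 10_000_000_000    # about 10 billion Tokens per day
TIER_2_TOKENS = 100_000_000_000   # 100 billion Tokens per day


def tier_discount_range(daily_tokens: int) -> tuple[float, float]:
    """Return (min, max) discount as fractions off the list price."""
    if daily_tokens >= TIER_2_TOKENS:
        return (0.30, 0.40)   # the article gives a range for this tier
    if daily_tokens >= TIER_1_TOKENS:
        return (0.20, 0.20)
    return (0.0, 0.0)         # assumption: no discount below the first tier


def discounted_daily_cost(list_cost_yuan: float, daily_tokens: int) -> tuple[float, float]:
    """Return (lowest, highest) daily cost in yuan after the tier discount."""
    low_disc, high_disc = tier_discount_range(daily_tokens)
    return (list_cost_yuan * (1 - high_disc), list_cost_yuan * (1 - low_disc))


# First tier from the article: about 100,000 yuan per day at list price.
low, high = discounted_daily_cost(100_000, TIER_1_TOKENS)
print(f"{low:.0f} to {high:.0f} yuan per day")
```

Under these assumptions, a provider at the first tier would pay about 80,000 yuan a day instead of nearly 100,000.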
Even for annual contract users, using Seedance 2.0 costs one yuan per second. By comparison, HappyHorse has a price advantage, but a modest one.
"If there are no highlights (in product capabilities), people's attitude is that it's dispensable, and the discount has little influence." "People won't choose an inferior product just because it's cheap." This is Zhao Yucheng's impression from the front line.
Because a model's capability is strongly correlated with how many Tokens it consumes, a lower price does not necessarily mean lower total spending. Take the comic drama industry as an example. A more capable product can reduce labor costs and the number of "card draws" (repeated generations until a usable result appears), cutting costs and improving efficiency.
In fact, Seedance 2.0 and Keling 3.0, which lead in capability, have always been in a seller's market. The latter still has a direct sales team of fewer than ten people.
In other words, video models that truly achieve a leap in intelligence and redefine industry rules will never lack users. Since its launch, the price of Seedance 2.0 has risen rather than fallen, and it has still become the main driver of Volcengine's revenue growth.
(Peng Qian also contributed to this article)