The Sora App's AI video social space gives companies like Baidu new hope.
Two weeks after the release of Sora 2, Baidu's Steam Engine AI video model and Google's Veo 3.1 ended up launching on the same schedule.
The two companies did not release their products simultaneously out of some tacit understanding. Rather, the pressure created by Sora 2 forced both to pick up the pace.
Altman described Sora 2 as "the ChatGPT 3.5 moment in the creative field". It has not only achieved a qualitative leap in physical logic, picture coherence, realism, and audio-video synchronization, but also turned Sora from a "text-to-video" tool into a "creativity-to-ecosystem" platform.
This is undoubtedly a bombshell in the field of AI video generation, and it reveals another possibility. By OpenAI's logic, the three pillars of Cameo (guest appearance), Remix (secondary creation), and social product design are enough to completely reconstruct the business logic of content creation.
This is not what Google, Meta, and TikTok want to see, but it is what Baidu, Alibaba, and 360 are looking forward to. The Baidu Steam Engine team admitted in a recent interview that Sora 2 has brought important inspiration in terms of productization and social fission.
Through functions like Cameo, Sora has ingeniously solved the problem of AI + social. It focuses on low-cost Remix co-creation among acquaintances rather than simply pursuing video quality. More importantly, competition among large-model makers has moved up from simply comparing which model is SOTA to value dimensions such as product implementation and commercial monetization.
This means that defining the capabilities of applications and products is becoming as important as building models. On the other side of the ocean, it is also the way forward that Sora's counterparts are looking for.
A Rational View of Sora 2's Progress
The strength of Sora 2's technology can be seen in how Kuaishou and Baidu reacted after its release. The former said its AI team worked non-stop for 8 days during the holiday, while the latter's vice-president said in an interview that the team had been working intensively for more than 50 days, adding that "it was legal overtime during the National Day and Mid-Autumn Festival holidays".
Compared with the early Sora text-to-video model, the core upgrades of Sora 2 are mainly in generation quality and interaction ability:
1) Physical consistency has been significantly optimized, and the dynamic modeling of rigid bodies, fluids, occlusion, and collisions is more accurate;
2) Controllability has been enhanced, and the camera movement and narrative rhythm can better respond to user script instructions;
3) A new native audio function has been added, which can generate dialogue and environmental sounds synchronously;
4) The picture styles cover multiple types of scenarios such as realism, movies, and animations, and the overall performance is more stable.
Building on this, Sora 2 delivers what earlier video models struggled to achieve: more accurate physical effects, clear realism, synchronous audio, strong controllability, and a wide range of styles. It can follow instructions precisely to create videos that combine imagination with realistic dynamics, which not only expands the toolkit for narrative and creative expression but also moves towards a model that can accurately simulate the complexity of the physical world.
In short, Sora 2 is a more aggressive iteration. It has fixed its earlier shortcomings and once again leads the industry. In terms of video quality, however, Sora 2 does not hold an absolute lead.
Huatai Securities recently ran a comparative test using the same prompt and found that the overall video generation quality of Keling and Jimeng is still better than that of Sora 2. Of these, Keling 2.5 Turbo has topped the text-to-video list on Artificial Analysis.
Baidu's Steam Engine AI video model is the world's first video generation model to integrate Chinese audio and video, and its latest version also targets Sora 2. It supports generating videos of unlimited duration, and users can interact in real time during generation, rewriting content or extending sequels at any time.
In other words, Sora 2 hardly has an overwhelming lead in technology, but in terms of its product model it is far ahead this time.
On the fourth day after its launch, the Sora App topped the free app list in the US App Store, surpassing OpenAI's ChatGPT and Google's Gemini. Sora is currently still in invitation-only testing and is available only on iOS devices in the United States and Canada. Despite these limitations, it still reached the top of the US App Store ranking.
According to data from the app intelligence provider Appfigures, even though it was limited to the United States and Canada and gated by invitations, Sora's iOS app received a total of 164,000 downloads in its first two days, September 30th and October 1st.
In terms of first-day downloads, Sora fell short of ChatGPT but was on par with Grok, launched by xAI. Considering that Sora is not yet fully open, its market potential may be even greater.
This is why a few days after its initial release, Sam Altman's clubbing video quickly disappeared from short-video platforms. Essentially, the Sora App represents OpenAI's core strategic transformation from a "single-dialogue tool" (ChatGPT) to an "ecological social platform".
More straightforwardly, the Sora App is here to take the place of short-video platforms. It is very likely to be the next-generation short-video platform.
The media and securities firms also regard Cameos and Remix as two revolutionary functions. They believe that Sora 2 is not just a simple video generation and creation tool, but the TikTok of the AI era.
Cameos: Users only need to make a one-time short audio-video recording in the app to verify their identity and capture their likeness. The Sora 2 model can then reproduce that likeness with remarkable fidelity. After that, users can authorize the use of their virtual likeness and place it in any AI scenario to create "guest-appearance videos" featuring themselves.
Remix: With the built-in editing tool, users only need to enter a prompt to carry out "secondary creation" on any video or trend on the platform and generate their own version.
More importantly, this layout is not a simple extension of functions but a deep-level optimization of the growth logic of AI products. It marks OpenAI's transformation from an "AI tool provider" to an "AI ecosystem builder":
Through the Sora App, it connects the complete link of "model ability → user scenario → commercial monetization". This not only avoids the lack of growth caused by the single-tool attribute but also consolidates its leading position in the field of AI-generated content with the double moats of "data flywheel + social network".
AI Video Social: Trying to Overturn the Short-Video Game
Observers attribute the Sora App's continuous first-place ranking on the iOS free list since October 4th mainly to three factors:
1) Rich UGC gameplay. With the Cameo function, which lets friends appear in a video, and the Remix secondary creation function, users can generate immersive interactive videos and appear in the same frame as friends or celebrities (such as Altman). AI also makes creative plots that depart from reality possible, which is both entertaining and social.
2) The invitation system promotes social fission. Sora uses an invitation-code system: new users get access by entering an invitation code, and each new user can invite 4 friends. This not only keeps the seed users consistent with the target group but also raises the perceived value of the product through a sense of scarcity.
3) ChatGPT has a deep user base at the consumer level. In September, the monthly active users (MAU) of ChatGPT's web version and mobile version reached 790 million (Similarweb) and 270 million (SensorTower) respectively, ranking first among large language models, so the traffic foundation is solid. At the same time, the web version of Sora is sold as part of the ChatGPT membership, which helps funnel users to it.
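To make the invitation mechanic in point 2 concrete, here is a minimal sketch of the theoretical upper bound on users if every user passes on all 4 invitations in each round. The function name, the seed count, and the idea of discrete "rounds" are illustrative assumptions, not figures from OpenAI or Appfigures; real growth would be far lower because not every invitation is used.

```python
def max_users_after_rounds(seed_users: int, invites_per_user: int, rounds: int) -> int:
    """Upper bound on total users when every user uses all of their invites.

    Hypothetical model: each round, only the users who joined in the
    previous round send invites, and every invite is accepted.
    """
    total = seed_users
    newest = seed_users  # users who joined in the latest round
    for _ in range(rounds):
        newest = newest * invites_per_user  # each newcomer brings in 4 more
        total += newest
    return total


# One seed user, 4 invites each, 3 rounds: 1 + 4 + 16 + 64
print(max_users_after_rounds(1, 4, 3))
```

Even under this idealized assumption, reach is capped by the number of rounds, which is how an invitation system can combine rapid spread with a sense of scarcity.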
Domestic AI video products are very likely to follow this strategy, especially those from companies that have the technology but lack social features, such as Baidu, 360, and Alibaba. After all, domestic products place more emphasis on video content creation. Interactive gameplay similar to Cameo and Remix has not yet been implemented, and consumer-level community culture is still at an early stage.
For ByteDance and Kuaishou, it is not impossible to launch an independent AI video app in the domestic market, as it can also achieve user diversion.
Although the Sora App does not pose a threat in the domestic market, it still has an impact in the overseas market. The reason why the Sora App is called the "AI version of TikTok" is that its interface and the recommendation logic of the home page are similar. The home page of the app is a vertical video stream, and users can scroll up and down to browse the content posted by other users.
But this is not the most crucial part. OpenAI is rapidly building a new IP-driven ecosystem around video generation.
Its core lies in two points: "granular control" and "revenue share". This means that Sora will no longer be a simple tool but an economic platform connecting IP owners and hundreds of millions of creators around the world.
Granular control means that OpenAI will give copyright holders finer-grained control, allowing them to manage character generation more precisely. It is similar to the "portrait consent" model but adds more control options.
Revenue share means that OpenAI plans to share part of the revenue with copyright holders who want users to generate their characters.
In this way, Sora is expected to form an "IP + creators" revenue-sharing business model led by the App platform.
For IP owners (Hollywood, major game companies, Japanese manga publishers), their dormant IP asset libraries have become "oil wells" that can be exploited 24 hours a day. They can not only earn licensing fees but also maintain the popularity and vitality of their IPs with the help of the creativity of global creators. They can even guide subsequent creation through data feedback (which characters and styles are the most popular).
Creators can finally use characters like Batman, Pikachu, and even characters from "The Three-Body Problem" to create videos legally, compliantly, and at low cost. Creation has changed from a "technical job" to a "creative job", and the core skill has shifted from operating software to prompt engineering and aesthetics. More importantly, popular videos can earn their creators a real-money share of platform revenue.
Epilogue
The social attribute of Sora 2 has transformed it from a "tool" into a "platform". The ability to define applications and products has been elevated to an unprecedented height, and this has also opened up new ideas for similar products.
In the past, investment in AI video products focused mostly on video generation quality. Applications mainly targeted a few B-end industries such as advertising, education, and self-media, with little involvement at the consumer level.
However, after the Sora App, the situation may change.
On the one hand, domestic companies are striving to catch up with the technological progress of Sora 2;
On the other hand, the development of social attributes has let companies like Baidu, which lack social resources, sense a second possibility for AI video.
Sam Altman also explicitly mentioned "providing a monetization mechanism for creators" in the Sora update information. That is, once Sora rounds out its editing functions and adds a user incentive mechanism, it may find a monetization path for users within a short time. Once the snowball starts to roll, Sora has the opportunity to become a closed-loop platform giant like TikTok, where users both produce and consume content.
And this is exactly what Baidu and others hope for.
This article is from the WeChat official account "Decoding NewSight", author: Yuan Xile. It is published by 36Kr with authorization.