
Twenty days of frantic running, "chasing" Sora2 across the entire web.

定焦One · 2025-10-20 10:09
Technological blitz and industry anxiety

The release of Sora2 has once again stirred the nerves of the global artificial intelligence industry.

On September 30th, OpenAI officially launched Sora2, a major upgrade of its video-generation model, along with Sora App, a social application built on the model. Compared with its predecessor, Sora2 is significantly better at physical simulation accuracy, visual realism, and controllability of generation, and it can generate audio and character dialogue in sync with the picture. This not only makes AI-generated videos more "lifelike" but also makes video production as easy as "writing".

In just five days, Sora App passed one million downloads. According to data from the app analytics firm Appfigures, its iOS downloads in the first week reached 627,000, higher than the 606,000 ChatGPT recorded in its first week. Although Sora App is still invitation-only, its downloads have grown faster than ChatGPT's did at the same stage. Some view it as the AI version of "TikTok" and expect it to become the next global phenomenon.

In China, Sora2 has also sparked a craze. During the National Day holiday, Sam Altman, the CEO of OpenAI, became a trending figure on social media. He opened up the rights to his own likeness, igniting the creative enthusiasm of netizens. Viral short videos showed his digital doppelganger being caught stealing GPUs from a store or fighting Bruce Lee. These widely shared clips further accelerated the popularity of Sora2.

Beyond the user side, the release of Sora2 has also jolted the industry. Large companies are accelerating product iterations. On October 15th, Baidu announced an upgrade of its video-generation model, Baidu Steam Engine, which supports real-time interactive generation of long AI videos. The next day, Google released its video models Veo3.1 and Veo3.1Fast, highlighting richer audio, stronger narrative control, and greater realism. Meanwhile, startup teams are sprinting at full speed. Several entrepreneurs in the AI video field told "Focus One" that they have been working overtime recently, and two of them could only take calls late at night.

Now, the public's enthusiasm has stabilized, but within the AI industry, a quiet technological competition is in full swing.

Those Chasing Sora2

Like a starting gun, Sora2 has propelled the AI video field into a "super-acceleration" phase.

Right after the National Day holiday, news spread on social media that the AI team at Kuaishou worked non-stop for eight days during the holiday to catch up with Sora2's technological progress.

Ding Yi, a documentary and advertising director focusing on AIGC creation, told "Focus One" that, as far as he knows, almost all domestic AI video startup teams have entered an "all-hands overtime" state. He predicts that domestic Sora2-like products may appear within two months. "All the large companies and model providers are in a fierce competition."

Wu Jiexi, the founder of Haoye Technology, also confirmed this sense of urgency. Her team has been working around the clock to test and dissect Sora2. Her startup project, FilmAction, is an AI movie-generation platform that overlaps with Sora2 in many functions. The emergence of Sora2 has left her both excited and under pressure: excited that the technological ceiling has been raised again, but anxious that the industry is iterating faster than anyone expected.

Just half a month after the release of Sora2, on October 16th, Google launched Veo3.1 and Veo3.1Fast, which is generally interpreted as a move to directly compete with OpenAI.

On the same day, OpenAI also announced two upgrades to Sora2: Pro users can now use the "storyboard" function on the web version, and the maximum video duration has been increased across all platforms. Regular users can generate 15-second videos on the app and the web, while Pro users on the web can generate 25-second videos, significantly longer than the previous limits of 10 seconds for the standard version and 15 seconds for the Pro version.

"This is very similar to when ChatGPT first emerged," Wu Jiexi said. "Everyone is desperately trying to catch up."

Based on the statements of multiple industry insiders, the shock of Sora2 comes from technological breakthroughs in three aspects:

First is the breakthrough in simulating the physical world. Sora2 can accurately simulate water flow, light and shadow, gravity, and collisions, and can even handle complex physical scenarios such as buoyancy and shifts in center of gravity, greatly improving the coherence of character movements and the stability of the main subject.

Second is multi-modal fusion. Sora2 can directly generate synchronized audio, automatically matching ambient sounds, action sound effects, and multi-language dialogue. Before this, only a few software programs offered such a function, and in those the sound quality was poor and characters' voices often failed to match their lip movements. Ding Yi's team also tested Sora2 with different dialects; the accents were natural and the lip-sync was accurate, which points to the maturity of the technology.

Third, and most important, the real disruption of Sora2 lies not just in its "real-looking" videos but in its understanding of "shot language".

Wu Jiexi's tests confirmed this. When she fed the original text of a novel or a script directly into Sora2, the generated videos not only matched the text closely in their imagery but also reached the level of professional creators in shot usage, pacing, and other aspects of audio-visual language.

Before Sora2, AI video creation was always limited by a "lack of shot thinking". Creators had to break down the script manually and carefully work out the logic of shot connections, the choice of character perspectives, and the scene transitions, all of which took a lot of time. Most AI tools on the market could only generate single, simple shots. To create a continuous narrative, users needed professional knowledge of audio-visual language and storyboard design skills, which set a relatively high threshold.

Sora2 has broken this limitation. Users only need to give a one-sentence text instruction, and it can automatically generate a complete video with multiple shot transitions and a coherent plot. In other words, Sora2 is no longer just a "picture-generation tool"; it has begun to show the narrative logic of a director and the shot-scheduling ability of an editor.

"If we compare Sora2 to an editor, his ability has surpassed 95% of people in the market," Ding Yi believes. Other AI video software is currently only an auxiliary tool, but Sora2 has, to some extent, taken on the prototype of an "intelligent agent".

The Collapse of the Creative Threshold: AI is Rewriting "Professionalism"

However, the other side of rapid technological development is a loosening of the industry's order. As the AI video field enters the "post-Sora era", those who have relied on professional barriers to make a living are often the first to feel the impact.

"Excited yet anxious," Ding Yi summarized his feelings in the twenty days since the release of Sora2.

His team was among the first batch of test users, and Sora2 changed the way they work almost immediately. Sora2 is now deeply integrated into the team's workflow and handles a lot of preparatory work, such as storyboard design. By registering four or five accounts, they can quickly generate a large number of plans and select the most satisfactory one. This is far more efficient than manual work, and the quality holds up across video concept, atmosphere, and camera movement.

Sora2 can generate a finished 15-second video from a single sentence, which means there is basically no technical threshold left for some of the low-cost commercial orders his team usually takes. He told "Focus One" that some small promotional ads on YouTube are already being generated with Sora2.

Another creator, Deng Deng (hereinafter referred to as "Deng"), was also affected.


In his latest short film, he used Sora2 to conceive several storyboards. After he uploads reference pictures and then describes the story background and plot in text, Sora2 automatically generates a video with three to four shots that fully presents the plot. By his count, a version that meets expectations takes an average of three attempts, and the "success rate" is much higher than that of other software.

He was deeply impressed by the creative convenience brought by technological progress. However, after the excitement, Deng also had a hint of unease. Storyboard design used to be the dividing line between professional creators and ordinary users, but Sora2 is blurring this threshold.

Deng told "Focus One" that before Sora2, no software supported automatic storyboard generation. Some software could do some simple storyboards, but still required clear prompts from users, such as what the first shot was and what the second shot was, and then it would give a storyboard combination within ten seconds.

Sora2 can directly generate a dynamic video. For example, when a netizen input the last sentence of "Xiang Jixuan Zhi" by Gui Youguang into Sora2, the first shot of the generated video was a close-up of Gui Youguang and the loquat tree, and the second shot was a flashback of Gui Youguang and his wife planting the tree. The video then cut back and forth between the flashback and shots of Gui Youguang gazing at the tree and missing his wife. The camera positions, angles, and shot transitions in this video were all designed by AI.

In the AI era, the dissolution of professional barriers due to technological progress may be a challenge that many people need to face.

New occupations such as AI directors and AI storyboard artists were originally new benefits brought by AIGC. However, as Sora2 gains the ability to "understand scripts", these positions may be phased out again.

Ding Yi has a deep feeling about this.

When he entered the industry, he was a storyboard artist. Later, he joined a director's team, became an executive director, and finally worked his way up to director. In the past, anyone proficient in a single tool, even just Photoshop, could find a job. But now, the space for purely technical positions is shrinking. Earlier this year, when he was shooting an experimental short film, he tried to hire some storyboard artists, but it didn't work out because "the efficiency was too low and the communication cost was too high".

After the anxiety, he began to adjust his mindset. At least for now, the final results of AI-generated content still need human review. Personal experience, aesthetics, and judgment all affect the final outcome. Ding Yi believes that interactive AI will become a human tool, just like pens and keyboards today, but that content and creativity will ultimately remain the deciding factors.

Jensen Huang, the CEO of NVIDIA, once told the media: "If there are no new ideas in the world, the productivity improvement brought by AI will eventually lead to unemployment." This statement is especially relevant in the upheaval brought by Sora2.

From "Technological Wonder" to Real-World Challenges

In an era when AI applications are emerging one after another, the transition from popularity to decline often only takes a few days. Many people are also waiting to see if Sora2 is just a "flash in the pan".

In terms of popularity, Sora2's heat has indeed cooled down.

WeChat Index and Baidu Index show that the peak popularity of Sora2 in China only lasted for a few days and then declined rapidly. Deng believes that on the one hand, domestic users cannot directly access Sora2, and on the other hand, the limitations of clarity and watermarks make it difficult for creators to use it commercially, thus weakening the topic's popularity.

Screenshot of the WeChat Index trend of Sora2 in the past 30 days

Wu Jiexi pointed out that as a news event, its popularity is bound to decline, but as a new creative tool, its popularization is just beginning.

In terms of prospects, a research report from Dongguan Securities also supports this view, stating that the release of Sora2 and its supporting social application marks the integration of AI video generation and social interaction. "It is expected to reshape the content creation and distribution ecosystem and may usher in the ChatGPT moment for AI video generation."

Looking back at the development history of language models, the emergence of ChatGPT was a decisive turning point for AI to move from the laboratory to the public. In this sense, Sora2 also marks a key inflection point in the field of video generation, that is, from technological experimentation to widespread application.

However, for products like Sora2 to become a tool for universal expression, they need to overcome more than just technological challenges.

First, there is the fog of copyright. In the early stage, Sora2 adopted an "opt-out" mechanism, which by default used publicly available Internet content to train the model and shifted the burden of protecting rights onto the copyright holders. This approach quickly drew strong resistance and legal threats from Hollywood talent agencies, the Motion Picture Association of America, and even the Japanese government.

Facing this collective pressure, OpenAI quickly adjusted its strategy. Altman announced that it would abandon the opt-out mechanism and switch to a more cautious "opt-in" model, which requires copyright holders to sign an explicit authorization agreement before their IP can be used. Altman also suggested introducing an IP revenue-sharing mechanism to share the platform's revenue with the rights holders who grant authorization.

Some lawyers believe that OpenAI's new mechanism has shifted the conflict from legal disputes to business cooperation. Although it still has limitations, it indicates that the AI industry is moving towards a new stage of paid licensing and jointly built ecosystems. A comment from a Hollywood producer is quite representative: "In the future, the operating model of film and television companies may be more similar to that of copyright management agencies than to that of traditional content producers. This trend is almost inevitable."

Second, there is the question of the monetization model. Currently, Sora2 is still used mainly for entertainment, such as generating funny videos or emojis. This type of low-value, high-frequency interaction cannot support the huge costs of model training and operation. In the future, a "paid model" for professional users or high-quality content creators may become the mainstream. How to balance advertising monetization and user experience remains a common challenge for all AI video providers.

These real-world questions and strategic adjustments outline the trajectory of the AI video industry from "wild growth" to "rational development".

In Ding Yi's words, with the emergence of Sora2, the global AI video track is moving into a higher-level stage of competition. Model providers are competing more fiercely, and the training data covers all kinds of material, including movies, animations, advertisements, and documentaries. "AI is learning the entire history of human images, and when technology reaches its peak, the competition will no longer be about algorithms but about creativity and implementation ability."

Twenty days after the release of Sora2, the world may not have changed overnight. But on every AI video creator's computer, the way stories are generated has quietly changed.

This article is from the WeChat official account "Focus One" (ID: dingjiaoone), written by Chen Dan and edited by Wei Jia. It is published by 36Kr with authorization.