Sleepless at three in the morning: Seedance 2.0 shows that AI's "compression" of real-world workflows is accelerating.
At three o'clock in the morning, I watched Tim of Yingshi Jufeng's video on ByteDance's newly updated Seedance 2.0, and I couldn't fall asleep at all.
This is the first time in the past year or so that AI progress has made me this excited. Or rather, this shaken.
Many people are waiting for the GPT-3.5 moment in video and think it is still two or three years away. Seedance 2.0 tells us it is already within reach.
Its strength is that it has brought every aspect of camera movement, storyboarding, and audio-visual matching under AI control, and done each of them very well. It understands light and shadow, perspective, and shot language.
What Tim showed in the video is control: AI's precise replication of the physical world.
The logic of AI is becoming clear and simple. AI is relentlessly compressing our workflows: from directing and shooting to editing and scoring; from product management and development to testing and delivery.
Every intermediate link is gradually being compressed.
In this article, I want to talk about how AI is changing the workflow and how it is reconstructing our work.
01
The GPT-3.5 Moment in the Video Industry
I can totally relate to the uncontrollable excitement Tim showed in the video.
Previously, we thought camera movement belonged exclusively to the physical world: sliders, jibs, drones, Steadicams. These rigs are expensive, and the people who operate them are even more so.
Seedance 2.0 has turned all of this into parameters. In the image-to-video demo, all you need is a photo of the protagonist plus a photo of the scene.
It can move the protagonist through that scene with whatever camera work you specify, and the consistency across multiple subjects is astonishingly well maintained.
Zooming, panning, and tilting used to mean laying track and having the lighting crew reposition lights for every setup.
Now each is just a line of text in the prompt. The physical constraints of the physical world have been replaced by the parameter constraints of the mathematical world.
Seedance 2.0 seems to understand three-dimensional spatial consistency.
It knows how background objects should show parallax when the camera moves left, and how shadow lengths should change when light falls from the right.
Seedance 2.0 has also begun to reach into editing. The model can grasp a video's rhythm, identify the emotional high points in the footage, and automatically match cuts to the beat of the music.
For editors, a rough cut that used to take hours may now take seconds.
The same goes for sound. In a basketball-court shot, the layered noise of the arena appears in sync with the picture.
This perceptual consistency is a key basis on which the human brain judges "reality", and AI has achieved it.
Video production was originally an extremely complex, systematic undertaking: the director conceives, the cinematographer turns the concept into light and shadow, the editor recombines the light and shadow into a narrative, and the composer stirs emotion with sound.
It is an expensive, inefficient, friction-filled linear workflow. Seedance 2.0 has broken this chain and compressed all of these jobs into a single model.
In essence, what AI is doing is continuously compressing our workflows.
In Seedance 2.0, we can see the prototype of AI compressing the work of directors, cinematographers, editors, and composers.
The GPT-3.5 moment in video has arrived.
The next two or three years will be a period of industry reshuffling; the old order is already collapsing.
02
AI Is Radically Compressing Our Workflows
The transformation in the video field is just one aspect of AI reshaping the workflow. A more profound transformation is taking place in the software field, on our mobile phone screens.
Recently, I used Alibaba's Tongyi Qianwen to order a cup of milk tea, and the experience made me think a lot.
It may indicate the end of the App era, or rather, the arrival of the "instant software" era.
Our current Internet experience is locked into the form of the "App".
To order a cup of milk tea, you have to unlock your phone, find the delivery App, tap in, sit through the splash ad, tap the search box, type "milk tea", scan through dozens of merchant listings, tap into a merchant's page, choose from dozens of products, pick the sweetness and ice level, place the order, and pay.
It is an extremely long process.
Why do we have to go through all this? Because the App is trying to serve everyone; it is chasing the lowest common denominator. It has to bury low-frequency needs in secondary pages and stuff in recommendations for the sake of monetization.
As for me, I don't need any of that. I usually order from the same three stores. I know which one has the best lemon tea and which one has the cleanest kitchen.
All I need is: "Order me a cup from my usual store, sugar-free."
Tongyi Qianwen's current capabilities are approaching this ideal.
You give it a command, and in the background it calls the relevant interfaces through code and agents to complete the order.
This is the "intention interface": you express an intention, and AI delivers the result. All the intermediate UI, interactions, and page-hopping are compressed away.
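The shape of an intention interface can be sketched in a few lines. This is a toy illustration, not how Tongyi Qianwen actually works: `parse_intent`, `place_order`, and the order fields are all invented stand-ins, and a real agent would use an LLM and a live merchant API where the stubs are.

```python
def parse_intent(utterance: str) -> dict:
    """Toy intent parser; a real agent would use an LLM here."""
    order = {"item": "milk tea", "store": "default", "sugar": "normal"}
    if "sugar-free" in utterance or "no sugar" in utterance:
        order["sugar"] = "none"
    if "usual store" in utterance:
        order["store"] = "usual"
    return order

def place_order(order: dict) -> str:
    """Stand-in for the backend API call the agent would make."""
    return f"ordered {order['item']} ({order['sugar']} sugar) from {order['store']} store"

def intention_interface(utterance: str) -> str:
    # The search box, merchant list, and checkout screens are all
    # compressed away: intent goes in, a delivered result comes out.
    return place_order(parse_intent(utterance))

print(intention_interface("Order me a cup from my usual store, sugar-free"))
```

The point of the sketch is the shape of the pipeline: every screen the user used to tap through collapses into one intent-to-result function call.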
When AI evolves from the "vibe coding" that Andrej Karpathy described into sufficiently powerful agents, each of our needs will be delivered through an instantly generated "one-time App".
The traditional chain of "product manager writes the requirements doc, developers write the code, testers find the bugs, then delivery", which takes weeks or even months, will be compressed by AI to under a minute.
This raises a fundamental business question: if I can generate an "App" in one minute to meet my immediate need, why would I still download one that weighs hundreds of megabytes?
The existing App ecosystem has an insurmountable structural contradiction: everyone's needs are unique, yet an App must serve everyone. AI can turn a user's natural-language need directly into a delivered result through instantly written code.
In effect, AI custom-builds a "dedicated App" for each user, used once and discarded, with nothing retained.
This is a huge challenge for today's Internet giants, whose moats are built on App installs and user time.
If Apps disappear and the entry point becomes an AI agent, where will their traffic come from? Where will they place their ads?
The entry point for the next era is gradually becoming clear.
It is obvious why every major company is racing to build large models and compete for the one "super agent".
Will products that today aggregate needs in App form become, in the AI era, products inside AI that serve personalized needs?
Today's App developers may become "data and API service providers". As the delivery chain is drastically compressed and costs fall, App requirements effectively become API requirements.
Every dialogue with the product is a result delivered to a user acting as their own product manager.
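What "App requirements become API requirements" might look like can be sketched as follows. Everything here is hypothetical: the endpoint names (`get_menu`, `create_order`), the store ID, and the payloads are invented for illustration, with in-memory stubs where a merchant's real service would sit.

```python
def get_menu(store_id: str) -> list:
    """What used to be browsing screens shrinks to one data endpoint."""
    return [{"sku": "lemon-tea", "price": 4.5}, {"sku": "milk-tea", "price": 5.0}]

def create_order(store_id: str, sku: str, options: dict) -> dict:
    """What used to be the checkout flow shrinks to one action endpoint."""
    return {"store": store_id, "sku": sku, "options": options, "status": "accepted"}

# An agent stitches the two calls together in place of a human tapping screens.
menu = get_menu("store-42")
cheapest = min(menu, key=lambda item: item["price"])
receipt = create_order("store-42", cheapest["sku"], {"sugar": "none"})
print(receipt["status"])
```

On this view, the merchant's remaining value is exactly the data and actions it exposes; the screens in between were only ever scaffolding for a human operator.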
Ultimately, the disappearance of traditional workflows means the disintegration of the company as an organization.
The company exists, in essence, to reduce transaction costs: communication is expensive and trust is expensive, so we gather people together, sign contracts, and pay salaries.
When one person plus AI can do what used to take a team, the large organization becomes unnecessary. We will see more and more "one-person companies"...
Looking at it this way, I believe AI's transformation of the world is accelerating.
This article is from the WeChat official account "Hard AI". Author: Xiaoxiaomao. Republished by 36Kr with permission.