36Kr Exclusive: Four Key Propositions of ByteDance AI in 2026

World Model, Seedance, Coding, Doubao Commercialization

Text by | Zhou Xinyu

Edited by | Zhang Yuxin, Yang Xuan

Intelligent Emergence has learned exclusively from multiple sources that ByteDance AI has four key propositions in 2026:

Increase investment in world model training. By the end of the year, the model's performance should match the current global state of the art (SOTA), Google's Genie 3.

Continue to maintain the leading position of the video model and explore new directions such as "dynamic generation".

Further strengthen the Coding foundation and do Coding dogfooding well (data backflow and evaluation that together form a flywheel) to improve Agent capabilities.

Enhance the commercialization capabilities of Doubao, with the key scenario being "office work".

ByteDance's Unfinished Business: World Models

Currently, ByteDance's AI matrix includes Seed 2.0, which has put ByteDance in the first echelon of large models in China, and Seedance 2.0, which has reached the global SOTA level. On the application side, Doubao has also built a significant lead: we learned from multiple sources that after the Spring Festival in 2026, Doubao's DAU reached 200 million.

"There are no obvious weaknesses." An AI strategist from a large company evaluated ByteDance's AI business matrix in this way.

However, one model is still missing from the lineup, and it is the key to the next stage of large model research: the world model.

Several people close to the Seed team told us that ByteDance is a latecomer in the world model track. In 2024, Zhou Chang, who had just joined ByteDance from Alibaba, took the lead in the research of world models.

At that time, the internal judgment was that the route and commercialization scenarios of world models were not clear, and it was more important to win the battle for video models.

It wasn't until 2025 that ByteDance set up a small research group and began to explore the VLA (Vision-Language-Action model) route to world models. There are two leaders:

One is Li Hang, the head of ByteDance's AI Lab, whose group mainly trains the world model on simulation data. In April 2025, the entire AI Lab (including the Robotics team) was merged into Seed, in part to improve communication efficiency between the model and its application (embodied intelligence).

The other is Wang Wenqian, a multi-modal researcher at Seed, who mainly conducts training based on natural data.

In 2026, Wu Yonghui finally set a clear goal for the world model at the Seed all-staff meeting: before the end of 2026, at least one version of the world model should be released, with performance comparable to the current global SOTA, Google's Genie 3, which was released in August 2025.

However, judging from current progress, ByteDance is not catching up fast enough. A person close to the Seed team told us that Wu Yonghui has repeatedly said in internal Seed meetings that the results of ByteDance's world model and embodied intelligence work have fallen short of expectations.

Another Seed member revealed that according to internal evaluations, as of the beginning of 2026, there was still a 10% gap between the comprehensive performance of ByteDance's world model and the global SOTA.

But this battle represents the future.

On the one hand, downstream of the world model lies an embodied intelligence market worth at least tens of billions of dollars, as well as game and entertainment scenarios with great room for imagination.

A former Seed researcher once told us that the previous implementation scenarios of ByteDance's robots were mainly item transportation and industrial handling, but the internal judgment was that the ceiling was relatively low. "Humanoid robots with a broader market prospect are a direction that ByteDance will definitely enter."

On the other hand, there is still no consensus on the route to world models. Competing camps include the video generation school, the VLA (Vision-Language-Action model) school, and the JEPA (Joint-Embedding Predictive Architecture) school.

"If it bets, ByteDance's talent density and capital investment give it a high probability of winning," an AI investor analyzed for us. "If it doesn't bet, it will definitely lose."

To reach the global first echelon, ByteDance has made many adjustments to world model training since the start of 2026.

Intelligent Emergence learned that after the Spring Festival in 2026, Seed established a new world model research group, led by Fan Haoqi, a former researcher at Meta's FAIR Lab, who reports to Zhou Chang, the head of Seed's multi-modal and world model research.

Meanwhile, the two VLA research groups led by Li Hang and Wang Wenqian were merged and now report to Zhou Chang.

Multiple informed sources told Intelligent Emergence that the routes explored by the research groups led by Li Hang and Wang Wenqian were mainly VLA, pursuing "impromptu" and "real" effects, with the target application scenario being embodied intelligence. The new team led by Fan Haoqi takes the 3D simulation route, focusing on application scenarios such as entertainment and games.

Beyond the expansion of headcount and exploration routes, the world model also receives the highest capital investment among model directions such as text, Coding, and video.

This is particularly evident in the data budget. An employee of ByteDance's data platform told us that the "high-volume" training data strategy has achieved significant results for LLMs (large language models) and Seedance 2.0. The team plans to apply the same "data ocean strategy" to world model training.

This also corresponds to higher data investment. We learned from multiple sources that in 2026, the budget for training data (including VLA, long videos, 3D and other modalities) allocated to the world model by ByteDance is the highest among all modalities, reaching tens of millions of yuan.

A data supplier mentioned that ByteDance's data investment in world models can reach 3-4 times that of other manufacturers.

Coding: Pursuing More Extreme Data Engineering

Coding ability is the foundation and the key to determining the upper limit of Agent effectiveness. This has become a consensus in the industry.

Multiple informed sources have mentioned ByteDance's emphasis on Coding to us. "ByteDance has always invested heavily in Coding, second only to the world model this year," a person close to Seed told Intelligent Emergence.

For example, the company purchases data in a targeted manner or studies the training data demos of top overseas Coding models such as Claude Code and Codex.

At the 2025 Volcengine Force Conference, Hong Dingkun, the vice president of technology at ByteDance, also said that as a highly structured and logically rigorous task, Coding has high requirements for the model to understand complex semantic structures, logical reasoning, algorithm design, and precise expression, which can help explore the upper limit of model intelligence.

However, outside the company, ByteDance's Coding business has not been very prominent. Neither the Doubao-Seed-Code model released in November 2025 nor the AI programming tool Trae released at the beginning of 2025 has matched the performance or popularity of Zhipu's GLM 5 and Moonshot AI's K2.

"The reason ByteDance's Coding performance is hard to break through is the lack of data backflow," an informed source commented. Because of the model's limited capabilities, the relevant ByteDance businesses are reluctant to use Seed-Code.

Even the AI Coding application Trae was initially connected to DeepSeek, Claude Code, and a Coding model trained in-house by the product team.

This has led to a lack of feedback from real application scenarios for ByteDance's Coding model.

Since 2026, many ByteDance employees have felt that various business units are increasing their support for the Seed model. A Seed employee told Intelligent Emergence that previously, ByteDance did not restrict business units from using third-party Coding models for development, but since 2026, multiple application departments have been required to use the Seed model.

However, as data investment has become more extreme, Seed's pace of hiring has slowed slightly.

An AI industry headhunter told Intelligent Emergence that ByteDance's HR is now signaling that the era of broad, high-salary recruitment is over. The next proposition is to cultivate and promote young talent internally and raise salaries for algorithm roles.

Currently, Seed's few openings mainly target AI talent from DeepSeek and large overseas companies such as OpenAI, DeepMind, and Meta. Examples include Guo Daya, a former core member of DeepSeek, and Dong Xin, a former NVIDIA researcher.

How Seedance Maintains Its SOTA Position

Another focus of ByteDance's AI models in 2026 is to maintain Seedance's SOTA position in the global video generation field.

"The victory of Seedance 2.0 is a victory of data," the founder of a video generation startup once told Intelligent Emergence. We learned that a large amount of training data and an evaluation team of over 2,000 people contributed to the outstanding performance of Seedance 2.0.

However, there are concerns about continuing the "high-volume" training approach. Some studies have shown an "Anti-Scaling Law" phenomenon in video generation: the more training data there is, the more likely the model is to "slack off", learning only certain key frames and ignoring the complete narrative. As a result, the later the training stage, the lower the return on "high-volume" data.

Two informed people on the data side told us that Seedance has reached the ceiling in pre-training. To improve performance in the future, it is necessary to clean the training data and conduct more refined post-training.

Meanwhile, "dynamic generation" ability is a new direction that the Seedance team is focusing on in 2026.

"Dynamic generation", or interactive video, means that users can enter instructions at any time to adjust the content and plot of the video being generated. This track has already produced Vivix AI (founded by Liu Yu, a former senior research director at SenseTime), which is valued at up to $1.32 billion.

Multiple informed people told Intelligent Emergence that Zhou Chang has always been very optimistic about the implementation prospects of dynamic generation.

"Interactive videos can be made into small games or interactive dramas, and can also be connected to the exploration of world models (video generation is also an exploration path for world models)," a person close to Seed said.

Accelerating the Commercialization and Overseas Expansion of Doubao

36Kr reported exclusively that Doubao is expected to officially launch paid content in late June. Meanwhile, Doubao is also planning to connect with Douyin e-commerce to round out its paid scenarios.

At the beginning of May 2026, Doubao updated its paid subscription plan in the App Store, with monthly subscription prices ranging from free to 500 yuan.

On June 3, Doubao officially announced that it will soon launch "Doubao Professional Edition", aimed at the productivity needs of professionals and covering services such as software development, data analysis, professional design, process automation, financial analysis, and scientific research.

Multiple informed sources revealed that after the Spring Festival, Doubao's DAU has exceeded 200 million. "Doubao's advertising budget is very low this year," an informed source said. In his view, the high DAU brings high inference costs and operations and maintenance pressure. By pushing commercialization now, Doubao is pursuing two goals: slowing its growth rate and becoming self-sufficient.

PPT generation is Doubao's core entry point for building users' willingness to pay. "Doubao hopes to strengthen the PPT generation function in order to charge white-collar workers in high-net-worth industries such as finance and law," a person close to Doubao told Intelligent Emergence. In the next stage, Doubao also plans to launch an enterprise version that connects to companies' internal systems, but how exactly the integration will work is still under discussion.

He said that this idea was inspired by the business models of overseas models. Currently, the commercialization path of charging for office scenarios has been verified overseas. According to data disclosed by Anthropic, just six months after the launch of Claude Code, its ARR reached $1 billion. One year after its launch, in February 2026, its ARR had reached $2.5 billion.

The considerable cash flow brought by Claude Code for enterprise development scenarios has also enabled Anthropic, which was established six years later than OpenAI, to surpass OpenAI's ARR at the beginning of this year.

Now, the problem that Doubao needs to solve is to transform users' mindset from regarding it as a "general entry point" where they can ask anything for free to an "office assistant" that can improve efficiency but requires payment.

However, the market that Doubao wants to enter is becoming crowded. A person from Doubao told Intelligent Emergence that during the process of researching enterprise customers, ByteDance found that the enterprise AI tool market has been occupied by many industry AI solution providers. As a latecomer, Doubao will inevitably face higher customer acquisition costs.

Intelligent Emergence learned that overseas expansion is also one of Doubao's important propositions this year.

Previously, according to a report by Jiemian News, the DAU of Doubao's overseas version, Dola, had exceeded 10 million by the end of 2025. Intelligent Emergence learned that Dola's growth target for 2026 is to reach 30 million DAU by the end of the year.

An informed source said that Dola mainly targets countries with less widely spoken languages. Currently, the overseas AI chatbot market is basically occupied by ChatGPT, Claude, and Gemini. Instead of competing directly with the "Big Three in AI" in the European and American markets, Dola's growth strategy is to enter these smaller-language markets in a differentiated way.

Third-party data shows that since the second half of 2025, Dola has frequently appeared on the download lists of app stores in countries such as Indonesia, Malaysia, and Mexico.

——

Since Wu Yonghui joined ByteDance a year ago, his task has been to lead Seed in fixing bugs while developing SOTA models. In 2026, in every AI battlefield, ByteDance's goal is to be the winner.

Now, the initial results of Seed 2.0 and Seedance 2.0 are showing. The engineering, data experience, and talents accumulated by Seed will also be reused in the new round of battles in a more efficient way.

(Deng Yongyi, an author at Intelligent Emergence, also contributed to this article.)

Cover source | AI-generated, Visual China


This article was originally produced by「阿菜cabbage」. For reprints or content cooperation, see the Reprint Instructions; unauthorized reprints will be held accountable.
