Why can't I rebuild Elon Musk's fully open-sourced platform X?

Musk is in Beijing, but his target is Europe.

Ever since Elon Musk open-sourced X, people have been complaining: "Elon Musk, you're unfair. You promised to make it open-source, but you didn't do it completely. Even with the code, nobody can create their own X platform."

But now it's possible. X's open-source repository has received the biggest update in its history. You can actually download the code and create your own X platform.

Elon Musk first released the code of X's recommendation algorithm on March 31, 2023, when the platform was still called Twitter. The company published two repositories, twitter/the-algorithm and twitter/the-algorithm-ml, on GitHub and disclosed part of the recommendation logic behind the "For You" timeline.

But that was more of a "code transparency demonstration". The outside world could see the basic operation of the recommendation system, but couldn't access the training data, model weights, advertising recommendation system, and other key parts.

This time, Musk is serious.

X is not the biggest social platform in the world: it has 570 million monthly active users. Its estimated revenue for 2026 is about $2.9 billion, 43% less than the $5.08 billion before Musk's acquisition. Before the acquisition, advertising accounted for up to 90% of X's revenue; after the acquisition, it accounts for less than 70%.

But it's still one of the most important social platforms in the world, a complete production system that processes 1.2 billion pieces of content daily and serves 500 million users. Leading global AI companies like Anthropic and OpenAI use X as their first-choice platform for distributing information.

Less than 24 hours after Musk posted on X about the open-sourcing, X's open-source GitHub repository had reached 20,000 stars.

Musk said in the open-source statement: "We know that this algorithm is quite stupid and needs urgent improvement. But at least you can see how we're working on improving it transparently and in real time. No other social media company does this."

The recommendation algorithm is the core business secret of social media and the underlying logic that decides "what users see, what they're interested in, and what they buy".

So far, none of the mainstream platforms has been willing to fully disclose this logic.

Musk has done it.

01 What exactly is being open-sourced?

The core of the X algorithm open-sourced this time is a Grok-based Transformer recommendation system.

The architecture of the whole system is not complicated, and the design is clear: it retrieves candidate posts from two sources, ranks them with a machine-learning model, and finally filters out inappropriate content before recommending the rest to users.
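The three stages described above (retrieve, rank, filter) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `build_feed`, the dictionary-based post record, and the callables passed in are hypothetical and are not taken from X's repository.

```python
from typing import Callable, Dict, Iterable, List

Post = Dict[str, object]  # hypothetical post record, e.g. {"id": 1, ...}


def build_feed(
    sources: Iterable[Callable[[str], List[Post]]],
    score: Callable[[str, Post], float],
    is_allowed: Callable[[Post], bool],
    user_id: str,
    limit: int = 10,
) -> List[Post]:
    """Toy pipeline: merge candidates, rank by score, drop disallowed posts."""
    # 1. Retrieve candidates from every source and de-duplicate by post id.
    candidates: Dict[object, Post] = {}
    for source in sources:
        for post in source(user_id):
            candidates.setdefault(post["id"], post)

    # 2. Rank all candidates with a single scoring function, best first.
    ranked = sorted(
        candidates.values(), key=lambda p: score(user_id, p), reverse=True
    )

    # 3. Filter out inappropriate posts and cut the feed to the limit.
    return [p for p in ranked if is_allowed(p)][:limit]
```

In a real system each stage would be a separate service; here they are plain functions so the order of operations is easy to follow.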

The two content sources are Thunder and Phoenix Retrieval.

Thunder is responsible for "In-Network" content, that is, posts published by the accounts you follow. It's an in-memory database that tracks the latest posts of all users in real time and can respond in under a millisecond.

When you refresh your feed, Thunder immediately pulls out the latest posts of the people you follow.
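The idea of an in-memory store that returns the newest posts of followed accounts can be illustrated with a toy class. This is a minimal sketch, not Thunder's actual design: the class name `RecentPostStore`, its methods, and the per-author limit are assumptions made for the example.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Tuple


class RecentPostStore:
    """Toy in-memory store that keeps only the newest posts per author."""

    def __init__(self, per_author_limit: int = 100) -> None:
        self._limit = per_author_limit
        # Each author gets a bounded deque; old posts fall off automatically.
        self._posts: Dict[str, Deque[Tuple[float, str]]] = defaultdict(
            lambda: deque(maxlen=self._limit)
        )

    def add_post(self, author: str, timestamp: float, post_id: str) -> None:
        # Assumes posts arrive in chronological order for each author.
        self._posts[author].append((timestamp, post_id))

    def latest_for(self, followed: Iterable[str], limit: int = 20) -> List[str]:
        """Return the newest post ids across all followed authors."""
        merged = [
            item for author in followed for item in self._posts.get(author, ())
        ]
        merged.sort(reverse=True)  # newest timestamp first
        return [post_id for _, post_id in merged[:limit]]
```

Keeping everything in memory and bounding the history per author is what makes lookups cheap: a feed refresh only touches the small queues of the accounts a user follows.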

Phoenix Retrieval is responsible for "Out-of-Network" content, that is, posts from accounts you don't follow but that the system thinks might interest you.

It uses machine learning to search for similar content and finds posts in the global corpus that are related to your previous interactions. This is the most important part of the recommendation system, and it decides whether popular posts from accounts unknown to you appear in your feed.
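Similarity search of this kind is commonly done by comparing embedding vectors. The sketch below assumes that a user is represented by the mean embedding of the posts they interacted with and that candidates are ranked by cosine similarity; the article does not say which method Phoenix Retrieval uses, and all names and vectors here are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def user_vector(history: List[Sequence[float]]) -> List[float]:
    """Average the embeddings of posts the user interacted with."""
    if not history:
        raise ValueError("history must contain at least one embedding")
    return [sum(column) / len(history) for column in zip(*history)]


def retrieve(
    user_vec: Sequence[float], corpus: Dict[str, Sequence[float]], k: int = 3
) -> List[str]:
    """Return the ids of the k corpus posts most similar to the user vector."""
    ranked = sorted(
        corpus, key=lambda post_id: cosine(user_vec, corpus[post_id]), reverse=True
    )
    return ranked[:k]
```

A production system would use an approximate nearest-neighbor index rather than sorting the whole corpus, but the ranking criterion is the same idea.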

After the candidates from both sources are merged, they enter a unified ranking phase. The core of this phase is the Phoenix Scorer, a Grok-based Transformer model.

This model doesn't predict "relevance" but the specific actions you might take on each post, such as the probability of liking, reposting, replying, clicking, reporting, or blocking.

Each action has a weight. Positive actions (liking, reposting) have a positive weight, and negative actions (reporting, blocking) have a negative weight. The final score is the weighted sum of all predicted probabilities.

Posts with a high score are shown near the top of the feed, and posts with a low score further down.

That's all.
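The weighted sum is simple enough to write down directly. In the sketch below, the action names and weight values are invented for illustration; the article does not list the weights X actually uses.

```python
from typing import Dict, Mapping

# Hypothetical weights: positive actions raise the score, negative ones lower it.
ACTION_WEIGHTS: Dict[str, float] = {
    "like": 1.0,
    "repost": 2.0,
    "reply": 3.0,
    "click": 0.5,
    "report": -10.0,
    "block": -8.0,
}


def score(
    probabilities: Mapping[str, float],
    weights: Mapping[str, float] = ACTION_WEIGHTS,
) -> float:
    """Weighted sum of the predicted probability of each action."""
    return sum(weights[action] * p for action, p in probabilities.items())


# Two posts with the same predicted engagement; one is likely to be reported.
friendly = {"like": 0.5, "repost": 0.1, "report": 0.0}
spammy = {"like": 0.5, "repost": 0.1, "report": 0.2}
```

With these made-up weights, a modest probability of being reported is enough to push a post below an otherwise identical one, which is the effect the negative weights are meant to have.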

X especially emphasizes in the open-source documentation: We have completely eliminated all manual feature-engineering methods and most heuristic rules.

The Grok-based Transformer model takes over all the arduous tasks. It understands your interaction history, such as what you've liked, replied to, or shared, and then automatically decides which content is relevant to you.

This means that earlier tactics based on keyword stacking and tag matching no longer work. The system now puts more weight on semantic understanding and can analyze in depth the actual value of content and the real needs of users.

But although it's open-source, it isn't completely open.

First, the model weights are not fully open.

The GitHub repository actually contains a bundled mini Phoenix model with 256-dimensional embeddings, 4 attention heads, and 2 Transformer layers, packed in a 3 GB compressed archive and distributed via Git LFS. This model lets developers run the end-to-end inference process directly without training anything themselves.
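To get a feel for how small these dimensions are, the figures can be put into a tiny configuration object. The sketch assumes a textbook Transformer layer (four attention projection matrices plus a feed-forward block with 4x expansion) and ignores biases, normalization layers, and embedding tables; the class name `MiniConfig` and the `mlp_ratio` parameter are hypothetical, and the actual layer layout of the released model may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniConfig:
    """Dimensions quoted for the bundled mini model."""

    embed_dim: int = 256
    num_heads: int = 4
    num_layers: int = 2

    @property
    def head_dim(self) -> int:
        """Width of a single attention head."""
        if self.embed_dim % self.num_heads:
            raise ValueError("embed_dim must be divisible by num_heads")
        return self.embed_dim // self.num_heads

    def approx_block_params(self, mlp_ratio: int = 4) -> int:
        """Rough weight count of the Transformer layers (assumed layout)."""
        d = self.embed_dim
        attention = 4 * d * d          # query, key, value, output projections
        mlp = 2 * mlp_ratio * d * d    # up-projection and down-projection
        return self.num_layers * (attention + mlp)


cfg = MiniConfig()
print(cfg.head_dim, cfg.approx_block_params())
```

Under these assumptions each head is 64 dimensions wide and the two layers together hold roughly 1.6 million weights, which is tiny by the standards of modern Transformer models.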

But this is just a "mini version". The Phoenix model that X actually runs in production is much larger; its parameter count, number of layers, and embedding dimensions are on a different order of magnitude. The open-source mini model is more of a teaching example that shows you how the system works, not the model X actually uses.

It's like a small teaching engine that demonstrates the principles and really runs, but it's not the engine that X uses every day to power the "For You" feeds of hundreds of millions of users.

The real production model is probably larger and more complex, with more training data, more parameters, and exposure to far more user behavior. Its accuracy, response speed, and ability to handle real-world traffic are therefore on a different order of magnitude from the mini model.

Second, the training data is not open.

The strength of a recommendation system lies half in the model and half in the data. X processes 1.2 billion pieces of content daily and has collected a huge amount of user behavior data, such as who liked what, who blocked whom, and what someone read, when, and for how long.

This data is the reason the Phoenix model can accurately predict user behavior.

But this data can't be made open-source. On the one hand, it's a privacy issue; on the other, it's a business secret.

Without this data, you can't train a recommendation system as good as X's, even if you have the complete model architecture and code.

Third, only the framework of the advertising system is open-source, not the strategy.

This release includes a new Ads module that handles ad insertion and placement, including brand safety tracking, and respects the boundaries of sensitive content. But the specific logic of ad bidding, the bidding strategy, and the ROI optimization algorithm, which are directly tied to X's revenue, are not fully open.

Fourth, only part of the content understanding pipeline Grox (a Grok-based content understanding service in the X recommendation system) is open-source.

Grox is a newly added service that provides classifiers, embedding models, and a task-execution engine for content understanding tasks such as spam detection, post classification, and enforcement of PTOS policies. But exactly how Grox decides whether a post is spam, how it recognizes offensive content, and how it enforces the platform policies is not fully transparent.

So, although you can create a social platform similar to X based on what's been open-sourced on GitHub, you can't build a recommendation system as good as X's.

You can get the complete system architecture, the candidate retrieval logic, the ranking framework, and the filtering rules, and you can run the end-to-end inference process. With enough engineering skill, you can actually build a similar recommendation system.

But you don't have X's data, X's production model, or the engineering optimizations and scheduling strategies X has accumulated over the past few years. Therefore, you can't replicate the X platform 1:1.

02 Why is it being open-sourced?

As early as October 2022, when he acquired Twitter, Musk publicly stated that "open-sourcing the algorithm to increase trust" was one of his goals for the platform.

On March 31, 2023, Musk made good on that promise for the first time. The platform, then still known as Twitter, published the source code of part of its recommendation algorithm on GitHub, including the logic for recommending tweets in the user timeline.

This open-sourcing attracted a lot of attention.

For the first time, developers could see the internal workings of the Twitter recommendation system and confirm some long-standing rumors, such as that certain accounts really were downranked by the algorithm and that certain content types really were favored for recommendation.

Musk said at that time that providing "code transparency" would be "incredibly embarrassing" at first but would ultimately "lead to a rapid improvement in the recommendation quality".

He also said: "Above all, we want to gain your trust."

But this open-sourcing was not complete. Most of the files in the GitHub repository were from the original upload, and there were only a few updates. Many developers complained that the code repository was not continuously maintained, the documentation was not detailed enough, and many key modules were not disclosed.

This time, Musk has obviously learned from the experience.

Interestingly, Musk was in Beijing when he posted about the algorithm update on X. But the actual target of this open-sourcing is Europe.

The X platform is facing increasingly strict regulatory scrutiny in Europe, and Musk is using "transparency" and "openness" as weapons to counter the regulatory pressure.

In July 2025, the French prosecutor's office launched an investigation into the X platform on suspicion of algorithmic bias and fraudulent data extraction.

The European Commission also issued an injunction against X, demanding that it provide material related to the algorithm. The investigation focuses on the spread of false information, poor content moderation, and a lack of transparency.

The X platform rejected the investigation at that time and claimed that it was a "politically motivated criminal investigation" that threatened users' freedom of speech.

Musk even wrote an obscene word under a tweet of the European Commission.

But refusing to cooperate is obviously not a long-term solution. So Musk made the algorithm open-source.

Instead of passively undergoing the review by regulatory authorities, he'd rather make the code public so that developers, researchers, and regulatory authorities around the world can see X's recommendation logic.

In this way, X can claim to be the "most transparent social platform in the world". Any accusation of algorithm bias or content manipulation can be answered with "The code is open-source, check it yourself."

Attack is the best defense.

Of course, open-sourcing also has its costs.

First, competitors can learn directly from X's architecture designs and engineering practices. Others can now examine exactly how X handles retrieval, ranking, and diversity control.

If some of X's designs are actually better than those of competitors, these designs will be quickly copied.

Second, the...

The views in this article are solely those of the author; the 36Kr platform only provides information storage services.
