
Elon Musk rarely backs down: he has open-sourced 𝕏's recommendation algorithm and mocked it himself, saying it is "terrible" but will be updated monthly from now on.

QbitAI, 2026-01-21 16:52
This is how a purely AI-driven recommendation system actually works.

The 𝕏 recommendation system published by Elon Musk can now be viewed in full on GitHub.

The open-sourced files make clear that it is an algorithmic system driven almost entirely by AI models.

We have removed all artificially designed features and most heuristic rules.

As soon as the news came out, the community was in an uproar, and the top comments were full of praise:

Incredible! No other platform is so transparent.

Elon Musk himself quickly retweeted the original post by the 𝕏 engineering team. But the otherwise outspoken Musk was modest this time:

We know that this algorithm is stupid and needs to be greatly improved, but at least you can see in real time and transparently how we are working to improve it.

No other social-media company does this.

Even before acquiring 𝕏 (then Twitter) in 2022, Musk repeatedly criticized the platform for being too closed.

Since the acquisition, he has kept his promise and published Twitter's core recommendation algorithm several times. So this time, too, he remains true to his goal.

This is how a purely AI-driven recommendation system works!

Without further ado, let's take a look at how the entire system works.

The system can be summarized in one sentence:

It is based on the same Transformer architecture as Grok-1 and decides which content to recommend to you by learning from your historical interactions (e.g., what you have liked, replied to, or retweeted).

As soon as the user opens the "For You" page, the client sends a request to the server, which starts the entire algorithm process.

Then the system first does something – it determines who you are, what you have recently done, and what kind of content you usually react to.

To achieve this, the system extracts two types of user information:

Action sequence: this is the most direct and strongest signal of interest, e.g., what you have recently liked, replied to, retweeted, clicked on, or dwelled on.

Features: these represent long-term attributes, such as your follow list, your declared areas of interest, your geographical location, and the device you are using.

The goal of this step is not to hand-craft features, but to create a "real-time user profile" that is as faithful to reality as possible.

In the past, engineers might have assumed that "certain attributes are very important" and then manually written rules or formulas to calculate a "user interest score".

But this was essentially an assumption by the engineers and not a reflection of the actual state of the users.

So Musk's algorithm makes no preconceived assumptions; instead, it collects as many raw user reactions as possible and feeds this data directly to the downstream model, which learns patterns from the raw data itself. (In other words: "de-humanization" and "end-to-end".)
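As a rough illustration of this "raw data in, no hand-written rules" idea, the real-time profile could be little more than an event log plus a few long-term attributes. The names below (`UserEvent`, `UserProfile`, `record`) are hypothetical and not taken from the repository:

```python
from dataclasses import dataclass, field


@dataclass
class UserEvent:
    """One raw interaction, stored as-is with no hand-crafted score."""
    action: str      # e.g. "like", "reply", "retweet", "click", "dwell"
    tweet_id: int
    timestamp: float


@dataclass
class UserProfile:
    """Real-time profile: a raw action sequence plus long-term features."""
    user_id: int
    action_sequence: list[UserEvent] = field(default_factory=list)
    features: dict[str, str] = field(default_factory=dict)  # e.g. locale, device

    def record(self, event: UserEvent, max_len: int = 512) -> None:
        # Only append raw events; the model, not an engineer, decides what matters.
        self.action_sequence.append(event)
        self.action_sequence = self.action_sequence[-max_len:]


profile = UserProfile(user_id=1, features={"device": "ios", "locale": "en"})
profile.record(UserEvent("like", tweet_id=42, timestamp=0.0))
```

Note that nothing here computes an "interest score"; the sequence is handed to the model unmodified.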

After the real-time user profile is created, the system splits into two paths and quickly narrows the millions of tweets across the platform down to a few thousand "potentially relevant" candidates.

One path goes through your own network: the Thunder module directly pulls the latest tweets from everyone you follow.

The other path is through external sources. The core search module, Phoenix Retrieval, pulls tweets from accounts you don't follow but might be interested in.

The two different information sources are treated uniformly in the following phases.
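A minimal sketch of this two-path retrieval, under the assumption that both paths return only tweet IDs that are then merged and de-duplicated (function names and the exact limits are hypothetical, not from the repository):

```python
def thunder_in_network(followed_ids, tweets_by_author, limit=1500):
    """In-network path: latest tweets from accounts the user follows."""
    candidates = []
    for author in followed_ids:
        candidates.extend(tweets_by_author.get(author, []))
    # Newest first; only IDs are kept at this stage.
    candidates.sort(key=lambda t: t["ts"], reverse=True)
    return [t["id"] for t in candidates[:limit]]


def phoenix_out_of_network(interest_index, user_interests, limit=1500):
    """Out-of-network path: tweets matching the user's interests."""
    ids = []
    for topic in user_interests:
        ids.extend(interest_index.get(topic, []))
    return ids[:limit]


def retrieve(followed_ids, tweets_by_author, interest_index, user_interests):
    # Both sources are merged and treated uniformly downstream.
    merged = thunder_in_network(followed_ids, tweets_by_author) + \
             phoenix_out_of_network(interest_index, user_interests)
    return list(dict.fromkeys(merged))  # de-duplicate while keeping order
```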

It should be noted that the filtered entries are initially just tweet IDs.

The system then supplements the information for each candidate tweet via the Hydration module, including the full tweet text, author information, images/videos, and historical interaction data, in order to conduct an in-depth evaluation.
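Conceptually, hydration is a lookup that expands bare IDs into full records; a sketch with a hypothetical in-memory `tweet_store` (the real system would query backend services):

```python
def hydrate(tweet_ids, tweet_store):
    """Expand bare tweet IDs into the full records needed for ranking."""
    hydrated = []
    for tid in tweet_ids:
        record = tweet_store.get(tid)
        if record is None:
            continue  # the tweet may have been deleted since retrieval
        hydrated.append({
            "id": tid,
            "text": record["text"],
            "author": record["author"],
            "media": record.get("media", []),
            "stats": record.get("stats", {}),  # historical interaction data
        })
    return hydrated
```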

And before the actual calculation begins, the Filtering module excludes content that is obviously unwanted, such as:

Repeated or expired posts

Content you have published yourself

Posts from blocked or muted accounts

Content containing keywords you have blocked

Posts you have already seen or that have been displayed in the current session

Subscription content you don't have access to

Remember that this step has only one task: to decide whether a piece of content "can appear", not whether it is worth recommending.
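The eligibility checks above can be sketched as one boolean predicate per tweet. The rules mirror the list in the article; the field and context names are hypothetical:

```python
def passes_filters(tweet, ctx):
    """Visibility gate: decides only whether a tweet MAY appear,
    not whether it is worth recommending."""
    if tweet["id"] in ctx["seen_ids"]:            # already shown this session
        return False
    if tweet["author"] == ctx["user_id"]:         # the user's own post
        return False
    if tweet["author"] in ctx["blocked_or_muted"]:
        return False
    if any(kw in tweet["text"].lower() for kw in ctx["muted_keywords"]):
        return False
    if tweet.get("expired", False):               # repeated or expired post
        return False
    return True
```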

The remaining content is finally individually input into the Phoenix sorting model for evaluation.

This model is based on Transformer technology and simultaneously receives:

The user's action sequence and features

The content and author information of a single candidate post

Then the model predicts the probability that the user will perform certain actions in relation to a tweet and combines these probabilities according to predefined weights (e.g., positive actions like liking increase the score, negative actions like blocking decrease it) to form a final sorting score.
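The weighted combination of per-action probabilities can be sketched as below. The specific actions and weight values are invented for illustration; only the sign convention (positive actions raise the score, negative actions lower it) comes from the article:

```python
# Hypothetical weights: positive actions raise the score, negative ones lower it.
ACTION_WEIGHTS = {
    "like": 1.0,
    "reply": 2.0,
    "retweet": 1.5,
    "block_author": -5.0,
    "mute_author": -3.0,
}


def final_score(predicted_probs):
    """Combine the model's per-action probabilities into one ranking score."""
    return sum(ACTION_WEIGHTS[action] * p
               for action, p in predicted_probs.items()
               if action in ACTION_WEIGHTS)


score = final_score({"like": 0.4, "reply": 0.1, "block_author": 0.02})
# 1.0*0.4 + 2.0*0.1 + (-5.0)*0.02 = 0.5
```

Predicting several concrete behaviors and mixing them afterwards keeps the model's outputs interpretable, which matches point (4) in the summary below.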

Based on this score, the system makes a few small technical adjustments –

For example, it controls the diversity of authors to avoid a single account having too high a share in the news feed (to prevent an influencer from flooding the feed).

It should also be noted that the system ensures each post is evaluated independently: candidate posts cannot "see" each other (there is no cross-attention between tweets).

All candidate posts are sorted by their final score, and the system selects the top-K posts as the recommendation result for this request.
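Author-diversity control and top-K selection can be combined in one pass over the sorted candidates. The cap values here are hypothetical:

```python
def select_top_k(scored, k=30, max_per_author=3):
    """Pick the k highest-scoring tweets while capping each author's share."""
    per_author = {}
    result = []
    for tweet in sorted(scored, key=lambda t: t["score"], reverse=True):
        count = per_author.get(tweet["author"], 0)
        if count >= max_per_author:
            continue  # prevent one account from flooding the feed
        per_author[tweet["author"]] = count + 1
        result.append(tweet)
        if len(result) == k:
            break
    return result
```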

And before the results are sent back to the client, the system conducts a final check to ensure that the content complies with the platform's safety guidelines –

For example, tweets that have been deleted, marked as spam, or contain violent or gory content are removed.

Finally, the information remaining after all these filters is displayed to the client user one by one according to their score.

In summary, the five key factors for the successful operation of this system (official focus) are:

(1) Purely data-driven, no manual rules.

The complex rules defining what a "good" piece of content is are completely discarded, and the AI model instead learns directly from the original user data.

(2) Use of a candidate isolation mechanism for independent evaluations.

When the AI model evaluates content, each piece of content cannot see other candidate content, but only the user information. This ensures that the score of a post is not changed by other posts in the same round, so that the scores can be cached and reused consistently and efficiently.

(3) Hash embedding for efficient search.

During retrieval and ranking, multiple hash functions are used for embedding lookups to improve efficiency.
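One common form of this technique is the hashing trick: instead of one embedding row per token, several hash functions each pick a row from a small shared table, and the rows are averaged so that a collision in one hash is smoothed out by the others. This is a generic sketch of that idea, not the repository's implementation:

```python
import hashlib
import random

TABLE_SIZE = 1 << 12   # one small shared table instead of a per-token vocabulary
DIM = 8
NUM_HASHES = 3

random.seed(0)
_table = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(TABLE_SIZE)]


def _bucket(token: str, seed: int) -> int:
    """Map a token to a table row using one of several hash functions."""
    digest = hashlib.sha256(f"{seed}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % TABLE_SIZE


def hash_embed(token: str) -> list[float]:
    # Average the rows picked by the independent hashes.
    rows = [_table[_bucket(token, s)] for s in range(NUM_HASHES)]
    return [sum(vals) / NUM_HASHES for vals in zip(*rows)]
```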

(4) Prediction of multiple behaviors instead of a single score.

The AI model does not directly output an unclear "recommendation value", but predicts multiple user behaviors simultaneously.

(5) Modular pipeline for quick iterations.

The entire recommendation system is modularly structured, so that the individual components can be developed, tested, and replaced independently of each other.

"Yes, this algorithm is terrible"

Although people admire Musk's transparency, this algorithm still has some "flaws".

A user commented after the publication of the 𝕏 recommendation algorithm:

Due to limited API access and high costs, blocking accounts has become rare these days, but it used to be very common.

The algorithm must ensure that older block lists fade over time so that they cannot be misused anymore.

That is, the algorithm code shows that "blocked by many users" is a strong negative signal that causes an account to be downgraded, i.e., its content is recommended less. But the code contains no time-decay mechanism for the blocking signal.

This means that old blocking histories may still affect an account's recommendation score.
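The fix the commenter asks for could be as simple as an exponential decay on each block's age, so that old blocks gradually stop counting. This is purely illustrative; the half-life and the function are assumptions, not part of the published code:

```python
HALF_LIFE_DAYS = 180.0  # hypothetical: a block loses half its weight in ~6 months


def decayed_block_penalty(block_ages_days, base_penalty=1.0):
    """Sum the block signals against an account, discounting each by its age."""
    return sum(base_penalty * 0.5 ** (age / HALF_LIFE_DAYS)
               for age in block_ages_days)


# A fresh block counts fully; a two-year-old block counts for very little.
fresh = decayed_block_penalty([0.0])
stale = decayed_block_penalty([730.0])
```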

These comments even prompted Musk himself to concede in the replies:

Yes, this algorithm is terrible.

But no matter what, Musk's will to change things is clear –

He has open-sourced the algorithm before, has done so again now, and plans to keep doing so: going forward, a new open-source update will be published every four weeks.

Open-source repository: https://github.com/xai-org/x-algorithm

Reference links:
[1] https://x.com/elonmusk/status/2013482798884233622
[2] https://x.com/elonmusk/status/2013496642851279270

This article is from the WeChat account "QbitAI" (focused on cutting-edge technology) and is published by 36Kr with permission.