How X Turned the Social Graph into a Guessing Machine
Table of Contents
- Prologue
- From social graph to prediction graph
- What Phoenix changes
- The Weighted Scorer and the missing weights
- The prediction trap
- Follower devaluation without follower irrelevance
- Author diversity and the volume trap
- Retweets, reposts and the community problem
- The 48-hour horizon
- Candidate isolation and the death of context
- The consumer-feed defence
- Algorithmic suppression without a censor
- What the code proves, what it does not
- What a real correction would look like
- Support Note
- Sources
- Changelog
X still displays follower counts as if they were social capital. The product interface continues to show the old public token of audience: this many people chose to subscribe to this account, to hear from this person, to let this voice into the stream. Yet the newer For You architecture described in xai-org/x-algorithm increasingly behaves as if that explicit act of following were not the central fact at all. The central fact is prediction.
That is the contradiction at the centre of the current creator revolt on X. @blac_ai gives the code-facing diagnosis: a creator with more than 60,000 followers, a reported reach collapse, and an agent-assisted reading of the open-source repository that points to Phoenix, Thunder, prediction heads, author diversity decay, repost deduplication and a 48-hour freshness horizon. @ianmiles gives the platform-facing diagnosis: a larger creator reporting a claimed 95 percent reach collapse, the devaluation of followers, niche confinement, a polluted For You feed, and the conversion of a social network into what he calls a slot machine.
Neither statement is, by itself, technical proof. Creator testimony is evidence of experience, not evidence of code. But the public code matters because it allows us to test whether these experiences are merely nostalgia, paranoia and platform melodrama, or whether they are structurally intelligible under X’s newer recommendation architecture. The answer is uncomfortable. The code does not prove every strong claim made by either creator. It does not prove a conspiracy against artists. It does not prove a deterministic death spiral. It does not prove that every repost is hit with a 90 percent penalty. It does not prove that every post everywhere on the platform dies after 48 hours.
It does show something colder and more consequential: X has built a prediction-first allocation system in which explicit social commitment is demoted beneath predicted behavioural yield.
From social graph to prediction graph
The old bargain of a social platform was simple enough for ordinary human beings to understand, which naturally made it vulnerable to replacement by something more elaborate and less accountable. A user follows an account. The account builds an audience. That follower relationship has distributional meaning. Not absolute meaning, not guaranteed meaning, not a promise that every post reaches every follower, but meaning nonetheless. The follow button was not decoration. It was the basic unit of the social graph.
The newer public X repository describes a different centre of gravity. Its top-level README says the For You feed retrieves, ranks and filters posts from two sources: In-Network content from accounts a user follows, supplied through Thunder, and Out-of-Network content discovered through Phoenix Retrieval from a global corpus. These sources are then combined and ranked using Phoenix, a Grok-based transformer model that predicts engagement probabilities for each post. The final score is described as a weighted combination of those predicted engagements.1
The old social graph is not gone. That would be too crude. Thunder still serves recent in-network posts from accounts the requesting user follows.2 Query hydration still fetches the user’s recent engagement history and metadata, including the following list.3 The follow relation still helps decide what can enter at least one candidate source. But the key shift is this: entering the candidate pool is not the same thing as being seen. Once candidates enter the pipeline, they are filtered, scored, selected and post-processed. The public architecture presents a system in which the explicit follow relation is only one input into a larger allocation process dominated by predicted engagement.
That is the source of the creator complaint. Not that follows literally do nothing. They plainly still do something in the in-network retrieval path. The complaint is subtler and more serious: the follow no longer appears to function as a strong distributional promise. It has become an upstream eligibility signal inside a prediction machine.
This distinction matters because @blac_ai’s strongest phrase, “your follower count is just a display number”, is technically too absolute but substantively meaningful. The visible code does not show a simple follower-count multiplier inside the WeightedScorer. The candidate structure includes author_followers_count, and author metadata is hydrated, but the scoring path visible in weighted_scorer.rs combines Phoenix prediction fields rather than directly multiplying reach by follower count. At the same time, the following relationship remains relevant to Thunder’s in-network candidate sourcing. So the precise formulation is not “followers do not count”. It is: follower count, as a public quantity of accumulated audience, appears de-emphasised relative to predicted engagement in the visible For You ranking path.
That is enough to explain why creators feel robbed. A follower count can remain visible while losing much of its practical force. The platform can still show the trophy while melting down the metal.
What Phoenix changes
Phoenix is the hinge of the new system. Its README describes a two-stage recommendation architecture. Retrieval narrows millions of candidates to a smaller set using approximate nearest-neighbour search. Ranking then orders those candidates using a transformer model. The retrieval side uses a two-tower model: a user tower encodes user features and engagement history into an embedding; a candidate tower computes embeddings for items in the corpus; similarity search retrieves top candidates by dot product.4
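The retrieval mechanics can be caricatured in a few lines. The sketch below is purely illustrative: the function names are invented, and a brute-force scan stands in for the approximate nearest-neighbour index used at production scale. It shows only the core idea the README describes, similarity by dot product between a user embedding and candidate embeddings.

```rust
// Hypothetical sketch of two-tower retrieval: user and candidate
// embeddings are compared by dot product and the top-k candidates are
// returned. All names here are illustrative, not from the repository.

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// Indices of the `k` candidates most similar to the user embedding.
fn retrieve_top_k(user: &[f64], candidates: &[Vec<f64>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f64)> = candidates
        .iter()
        .enumerate()
        .map(|(i, c)| (i, dot(user, c)))
        .collect();
    // Highest dot product first.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

fn main() {
    let user = vec![1.0, 0.0];
    let candidates = vec![
        vec![0.9, 0.1],  // well aligned with the user embedding
        vec![-1.0, 0.0], // opposite direction
        vec![0.5, 0.5],
    ];
    println!("top candidates: {:?}", retrieve_top_k(&user, &candidates, 2));
}
```

The point of the two-tower split is operational: candidate embeddings can be precomputed for the whole corpus, so serving a request costs one user-tower pass plus a similarity search rather than a full model evaluation per candidate.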
The ranking side is more important for the present dispute. Phoenix uses a transformer with candidate isolation. Candidates can attend to the user and the user’s history, but not to each other during inference. The documentation says this makes a candidate’s score independent of the other candidates in the batch.5 In product terms, this means Phoenix is not first asking, “Did this user choose to follow this creator?” It is asking something closer to: “Given this user’s history, what actions is this user likely to take on this candidate?”
The top-level README lists fifteen predicted action probabilities: favourite, reply, repost, quote, click, profile click, video view, photo expand, share, dwell, follow author, not interested, block author, mute author and report.6 The implementation goes further. The PhoenixScores structure visible in the repository includes fields such as favorite_score, reply_score, retweet_score, photo_expand_score, click_score, profile_click_score, vqv_score, share_score, share_via_dm_score, share_via_copy_link_score, dwell_score, quote_score, quoted_click_score, follow_author_score, not_interested_score, block_author_score, mute_author_score, report_score, and continuous dwell time.7 This is where the creator-thread number of “19 prediction heads” becomes intelligible. The README summary lists fifteen action categories. The scoring implementation uses a richer set of score fields.
This does not mean that every one of @blac_ai’s interpretations follows automatically. It does mean the core mechanism is real: Phoenix predicts multiple kinds of user action, and the For You system ranks candidates according to a weighted combination of these predictions.
The Weighted Scorer and the missing weights
The WeightedScorer is the practical bridge between prediction and distribution. Its job is not mystical. It takes Phoenix scores and turns them into a single ranking score. In weighted_scorer.rs, the code obtains PhoenixScores from a candidate and applies weights to the available score fields. The visible combination includes favourite, reply, retweet, photo expansion, click, profile click, video-quality-view-like scoring, share, share via DM, share via copied link, dwell, quote, quoted click, dwell time, follow author, not interested, block author, mute author and report.8
That is a crucial finding. It directly supports the claim that the system does not merely ask whether a post belongs to a followed account. It predicts behaviour and converts those predictions into ranking priority.
But there is an equally crucial limit. The weights come from params. The open-source release does not expose params. In home-mixer/lib.rs, clients, params and util are marked as excluded from the open-source release for security reasons.9 That means one may safely say that replies, retweets, photo expansions, clicks, dwell, reports and other signals are weighted and combined. One may not safely say, from this repository alone, that replies are definitely weighted more heavily than likes in production, or that a given action has a precise numerical advantage, unless those weights are independently published in a primary source.
This is where a lot of public algorithm commentary goes intellectually feral. It sees a mechanism and immediately promotes it into a final causal law. The visible code shows the structure of weighted prediction. It withholds the constants that determine the relative force of each prediction. That omission matters.
It also matters that normalize_score comes from util, which is likewise excluded.10 The visible formula is not the whole production story. The ranking architecture is visible. The exact runtime tuning is not.
The strongest evidence-based statement is therefore this: X’s public For You code ranks candidates by combining multiple predicted user actions into a final score, with positive actions such as favourite, reply, retweet, share and dwell entering alongside negative actions such as not interested, block, mute and report. The public repository does not reveal the full production weights or all normalisation logic.
That is already enough to sustain the central critique. The system is not primarily distributing according to earned audience. It is allocating according to predicted behavioural yield.
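The shape of that combination can be sketched, with loud caveats. The score field names below appear in the repository, but the subset is abridged and every weight is a placeholder invented for illustration: the real constants live in the excluded params module and are not public.

```rust
// Minimal sketch of the weighted-combination step, in the spirit of
// weighted_scorer.rs. Every weight below is a placeholder -- the
// production values are in the excluded `params` module.

struct PhoenixScores {
    favorite_score: f64,
    reply_score: f64,
    retweet_score: f64,
    dwell_score: f64,
    report_score: f64, // a negative signal
}

fn weighted_score(s: &PhoenixScores) -> f64 {
    // Placeholder weights, NOT the production values.
    0.5 * s.favorite_score
        + 1.0 * s.reply_score
        + 0.8 * s.retweet_score
        + 0.3 * s.dwell_score
        - 2.0 * s.report_score // predicted negative actions subtract
}

fn main() {
    let s = PhoenixScores {
        favorite_score: 0.2,
        reply_score: 0.1,
        retweet_score: 0.05,
        dwell_score: 0.5,
        report_score: 0.01,
    };
    println!("score = {:.3}", weighted_score(&s));
}
```

Note what the sketch does and does not establish: the structure, predictions in, one scalar out, with negative actions pulling the score down, is visible in the public code; the relative force of each term is exactly what the public code withholds.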
The prediction trap
@blac_ai calls this the “Prediction Trap”. The phrase is rhetorically heated, but the underlying concern is not absurd. A prediction-first feed can create a feedback loop.
The visible architecture supports the first half of that concern. Before a candidate is selected for display, Phoenix predicts the likelihood of various engagements. WeightedScorer combines those predictions. A selector then chooses top candidates by score.11 A post with lower predicted engagement can therefore receive lower exposure, simply because the model expects less from it. Lower exposure then means fewer opportunities to gather observed engagement. If future user-action sequences and historical engagement patterns influence later predictions, weak distribution can become weak evidence, and weak evidence can become lower distribution.
That loop is not unique to X. It is a general pathology of recommender systems. But X’s public code makes the concern especially pointed because the system explicitly describes the final score as a weighted combination of predicted actions.12 The platform is not merely ranking what users already chose to follow. It is forecasting what they might do, then allocating visibility accordingly.
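The dynamic can be caricatured in a toy model, wholly invented for illustration: exposure share follows predicted score, and each prediction drifts toward the true engagement rate only in proportion to the exposure the post received. Nothing here models X’s actual training or exploration policy; it only shows how unequal exposure slows the correction of an unlucky prediction.

```rust
// Toy caricature of the exposure feedback loop. Both posts are, in
// truth, equally engaging; the under-predicted post corrects toward
// its true rate more slowly because it gathers less evidence.

fn simulate(mut pred_a: f64, mut pred_b: f64, rounds: usize) -> (f64, f64) {
    let true_rate = 0.5; // identical true appeal for both posts
    let lr = 0.1;        // learning rate, arbitrary
    for _ in 0..rounds {
        let share_a = pred_a / (pred_a + pred_b); // exposure follows prediction
        // Each prediction moves toward the truth in proportion to the
        // exposure (and hence evidence) that post received this round.
        pred_a += share_a * lr * (true_rate - pred_a);
        pred_b += (1.0 - share_a) * lr * (true_rate - pred_b);
    }
    (pred_a, pred_b)
}

fn main() {
    let (a, b) = simulate(0.4, 0.1, 50);
    println!("after 50 rounds: a = {:.3}, b = {:.3}", a, b);
}
```

Both predictions eventually converge to the truth in this toy, which is the honest caveat: a feedback loop of this kind slows correction rather than forbidding it, absent harder thresholds that the public code does not reveal.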
However, the deterministic version of the claim is not proven by the code. The repository does not, by itself, show the full exploration policy, live-feedback update cadence, A/B testing configuration, cold-start protections, production thresholds or model retraining loop. The code makes the feedback-loop concern structurally plausible. It does not prove an unavoidable death spiral for any given account.
This distinction is not pedantry. It is the boundary between analysis and incantation. @ianmiles says large accounts are watching reach collapse month over month. @blac_ai says a declining account may become trapped because Phoenix predicts continued underperformance. Those experiences may be accurate. They are certainly coherent with the architecture. But to prove the magnitude, one would need platform-side data: impressions by source, follower-only impressions, out-of-network impressions, score distributions, candidate rejection reasons, and longitudinal model behaviour across accounts.
The code explains why creators are anxious. It does not, alone, quantify the abyss.
Follower devaluation without follower irrelevance
@ianmiles’ central claim is not a line of code. It is a political economy claim about platform value: followers used to be social capital, and the algorithm has retroactively devalued every account. That claim is emotionally loaded, but it captures a real architectural distinction.
In the public repository, Thunder supplies in-network posts from accounts a user follows.13 That means the follow relation remains a retrieval mechanism. It is not meaningless. A creator without followers is not in the same starting position as a creator with followers, because the latter can enter the in-network candidate path for those followers.
But a retrieval mechanism is not a distribution guarantee. Once candidate sourcing is complete, the pipeline applies filters, Phoenix predictions, weighted scoring, author diversity scoring, out-of-network adjustment, top-K selection and post-selection validation.14 The follow relation may get a post into the room. It does not necessarily get the post onto the stage.
This is where @blac_ai’s phrase becomes useful despite its overstatement. “Follower count does nothing” is too crude. “The follow no longer means what creators thought it meant” is harder to dismiss. The explicit social choice survives, but it is subordinated to a model’s estimate of future behaviour. That is a profound shift in platform governance.
The difference is not cosmetic. A social-graph-first network asks users to build their own information environment. A prediction-first feed treats those user choices as useful but defeasible signals inside a behavioural optimisation system. The user says, “I chose to follow this person.” The system replies, “Noted. Now let us calculate whether showing them to you is worth it.”
That is the demotion at issue.
Author diversity and the volume trap
The most directly code-supported part of @blac_ai’s volume argument is the AuthorDiversityScorer. Its file-level comment says it diversifies authors served within a single feed response.15 The scorer sorts candidates by weighted score, counts how many times each author has already appeared, and applies a multiplier based on the formula:
(1.0 - floor) * decay_factor.powf(position as f64) + floor
This is exponential decay. A creator’s first candidate in the ordered list has one multiplier; the second candidate by the same author has a lower multiplier; subsequent candidates are attenuated further, bounded by the floor. The scorer is then inserted into the Phoenix candidate pipeline after PhoenixScorer and WeightedScorer, and before OONScorer.16
That directly supports one mechanism behind the “posting too much hurts” claim: multiple posts from the same author can compete with one another inside the same feed response. The code does not merely say “diversity” in a product-manager haze. It implements repeated-author attenuation.
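The formula itself is small enough to run. The decay and floor values below are placeholders chosen only to show the shape of the curve: the real AUTHOR_DIVERSITY_DECAY and AUTHOR_DIVERSITY_FLOOR constants live in the excluded params module.

```rust
// The attenuation formula from AuthorDiversityScorer, with invented
// constants (0.5 and 0.2); the production values are not public.

fn diversity_multiplier(decay_factor: f64, floor: f64, position: usize) -> f64 {
    (1.0 - floor) * decay_factor.powf(position as f64) + floor
}

fn main() {
    // An author's first candidate is untouched (multiplier 1.0);
    // later appearances decay exponentially toward the floor.
    for position in 0..5 {
        println!(
            "appearance {}: multiplier {:.3}",
            position,
            diversity_multiplier(0.5, 0.2, position)
        );
    }
}
```

Whatever the real constants are, the functional form guarantees two things: the first appearance is never penalised, and repeated appearances are attenuated toward, but never below, the floor.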
But the creator-thread version again overreaches in places. The code shows decay within a candidate list or feed response, not a proven permanent account-level throttle. The position value is how many times that author has already appeared within the set being scored. The concrete AUTHOR_DIVERSITY_DECAY and AUTHOR_DIVERSITY_FLOOR values come from params, and params is excluded.17 The public code therefore shows the shape of the mechanism but not its production severity.
The “posting too little” side is even less directly evidenced. I do not see an explicit visible inactive_author, low_activity_author or dormancy penalty in the cited public code. It is plausible that sparse posting weakens the available user-action sequence, reduces fresh candidate availability, and gives the prediction system fewer positive signals. That is a reasonable inference. It is not a direct code finding.
The result is a nasty asymmetry. Frequent posting can be structurally punished within a feed response by author diversity attenuation. Infrequent posting may be punished indirectly by freshness bias and weak engagement history. The first is visible in code. The second is a plausible creator-side interpretation requiring empirical validation.
Still, the lived effect can be exactly as creators describe: post too often and your own posts collide; post too rarely and the system has little fresh material or recent signal to allocate. The safe band may exist in practice. The public repository does not tell creators where it is. Charming. A platform can turn posting into applied superstition and still call it optimisation.
Retweets, reposts and the community problem
@blac_ai’s claim about reposts is the most fragile part of the technical thread. The public code clearly contains deduplication. It does not, in the inspected repository, prove a 90 percent impression penalty for reposting other artists’ work.
RetweetDeduplicationFilter deduplicates retweets by keeping only the first occurrence of a tweet, whether as original or as retweet. The filter tracks seen tweet IDs. If it encounters a retweet of a tweet already represented, it removes the duplicate. If it sees an original, it marks that original tweet ID as seen so retweets of it can be filtered later.18 This is removal as duplication control. It is not a visible numeric downweight of 90 percent.
PreviouslySeenPostsFilter also matters. It uses Bloom filters and directly provided seen IDs from the request to remove posts the user has already seen or related posts whose IDs match that seen logic.19 This supports the existence of Bloom-filter-based seen filtering. It does not prove the richer story that self-reposts are generally viable while reposts of others are algorithmically buried with a specific numeric penalty.
The distinction is large. Deduplication says: do not show the same underlying thing again, or at least do not show it repeatedly in the same context. A 90 percent reach penalty says: show it, but suppress its distribution by a defined fraction. Those are different mechanisms. Conflating them is analytically sloppy.
There may be external X policy around aggregators or repost monetisation. That would need a separate primary source. It is not proven by retweet_deduplication_filter.rs. Based solely on this repository, the defensible claim is narrower: reposts and retweets are subject to deduplication and previously-seen filtering, and those mechanisms can reduce the circulation of reshared material. The stronger claim of a repository-proven 90 percent penalty is not supported.
Even so, the community problem remains real. Creator networks are not only made of original posts. They are made of endorsement, amplification, mutual visibility and shared audience formation. A system that aggressively deduplicates already-seen or related reposted content may improve feed cleanliness for passive consumers while weakening the social rituals by which creator communities sustain one another. No numeric penalty is required for that effect. A binary filter is quite capable of doing cultural damage. Efficiently, even. Silicon Valley enjoys nothing more than collapsing a social practice into a Boolean.
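For readers unfamiliar with the mechanism named above, a textbook Bloom filter is easy to sketch. This is generic illustration of the family of data structure PreviouslySeenPostsFilter is described as using, not code from the repository: the defining trade-off is that it can occasionally flag an unseen post as seen (a false positive) but never lets a recorded post through (no false negatives).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Generic textbook Bloom filter for "seen" tracking. Illustrative
// only; not the repository's implementation.

struct SeenFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl SeenFilter {
    fn new(size: usize, num_hashes: u64) -> Self {
        SeenFilter { bits: vec![false; size], num_hashes }
    }

    // Derive one of `num_hashes` bit positions for a post id.
    fn bit_index(&self, post_id: u64, seed: u64) -> usize {
        let mut h = DefaultHasher::new();
        (post_id, seed).hash(&mut h);
        (h.finish() as usize) % self.bits.len()
    }

    fn mark_seen(&mut self, post_id: u64) {
        for seed in 0..self.num_hashes {
            let i = self.bit_index(post_id, seed);
            self.bits[i] = true;
        }
    }

    /// True if the post may have been seen; never false for a post
    /// that was actually marked.
    fn maybe_seen(&self, post_id: u64) -> bool {
        (0..self.num_hashes).all(|seed| self.bits[self.bit_index(post_id, seed)])
    }
}

fn main() {
    let mut filter = SeenFilter::new(1024, 3);
    filter.mark_seen(42);
    println!("42 seen: {}", filter.maybe_seen(42));
}
```

The asymmetry matters for creators: a structure like this is cheap and compact at feed scale, but its errors only ever run in one direction, toward suppressing circulation, never toward showing something twice.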
The 48-hour horizon
Thunder is the strongest evidence behind the 48-hour claim, and also the place where the claim must be narrowed.
The top-level README describes Thunder as an in-memory post store and real-time ingestion pipeline. It tracks recent posts from all users, consumes create and delete events, maintains per-user stores for original posts, replies or reposts, and video posts, serves in-network candidates from accounts the requesting user follows, and automatically trims posts older than the retention period.20
The PostStore implementation contains a retention_seconds field. When posts are inserted, the code retains only posts created within the retention window and not from the future. The trim_old_posts() function removes posts older than that period from the user maps. The default implementation sets retention to 2 * 24 * 60 * 60, which is 48 hours.21
This directly supports a limited version of @blac_ai’s claim: Thunder’s in-memory in-network recent-post store has a 48-hour default retention horizon. Older posts are not merely lower-ranked there; they are filtered or trimmed from that store.
But the global claim is overstated. Thunder is one source. Phoenix Retrieval is described separately as ML-discovered posts from a global corpus.22 The code and README do not establish that search, profiles, conversation views, quote chains, external links, notification contexts or every out-of-network corpus path forget posts after 48 hours. Nor does the inspected code prove that Thunder auto-trims every two minutes. The function start_auto_trim(interval_minutes) exists, but the concrete interval is supplied as an argument; the default two-day retention is visible, while the two-minute interval is not established by the same evidence.23
The fair conclusion is severe enough: X’s main in-network recent-post path is structurally freshness-biased. The system’s visible design favours recent candidates. For creators, that means older work may quickly fall out of the central in-network recommendation machinery, even if it still exists elsewhere on the platform. That creates the sensation @blac_ai describes: one is constantly starting again, constantly seeking fresh prediction, constantly re-entering the lottery.
Again, the code does not prove universal disappearance. It proves a short horizon in a major candidate source. That is not nothing. For many creators, it may be the difference between a durable audience and a treadmill.
Candidate isolation and the death of context
One of the more subtle design decisions in Phoenix is candidate isolation. During transformer inference, candidates cannot attend to one another; they can attend to the user and history, but only to themselves among candidates.24 The documentation frames this as a consistency feature: the score for a post does not depend on which other posts happen to be in the batch.
Technically, that is elegant. Product-wise, it is defensible. It makes scores more stable and cacheable. But it also reveals how the system thinks. Each post is evaluated as a candidate response to a behavioural prediction problem. It is not primarily evaluated as a piece of an ongoing relationship between creator and audience. The candidate becomes a prediction object.
This is where @ianmiles’ “niche prison” complaint becomes intelligible. The code does not literally say “punish people for having multiple interests”. But a model trained on engagement histories and optimised around predicted action probabilities will tend to reward consistent, legible patterns. An account that posts only one kind of content produces a cleaner behavioural signature. An account that behaves like a human being, with politics, technology, gaming, culture, memes and personal fragments, becomes harder to predict. In a prediction-first allocation system, human topical range can degrade model confidence or reduce predicted engagement.
That is not direct proof of niche punishment. It is a strong structural inference. The architecture privileges legibility. A narrowly defined content category is legible. A person is noisy. The model may not hate human variety. It simply has no institutional reason to honour it.
This is the coldness of the thing. The system does not need ideology to flatten creators. It only needs an objective function.
The consumer-feed defence
A fair analysis has to grant the strongest argument for X’s architecture. Passive consumers do not want a chronological dump of everything they follow. Discovery is valuable. Spam is real. Repetition is irritating. Reposts can become clutter. Muted accounts and keywords should be respected. Posts already seen should not reappear endlessly. A model that predicts engagement from user history can improve short-session satisfaction. A For You feed that never leaves the follow graph becomes stale. A platform without out-of-network discovery becomes a private newsletter reader with arguments.
From that perspective, Thunder plus Phoenix is not irrational. Thunder supplies recent in-network material. Phoenix retrieves out-of-network candidates. Home Mixer filters duplicates, old material, self-posts, muted content, previously seen posts and visibility-filtered material. Scorers rank candidates. Selectors choose top posts.25 This is coherent product engineering.
The structural flaw is that a platform is not only a consumption interface. It is also a creator economy, a relationship graph and a reputational system. The audience a creator builds is not incidental waste heat. It is the reason the platform exists.
A prediction-first feed can optimise the next session while liquidating the social capital accumulated across years. It can maximise behavioural yield while eroding trust. It can improve novelty while degrading loyalty. It can make the feed feel alive while making creators feel dispossessed. It can make the platform more efficient as an attention allocator and less legitimate as a social network.
This is the point @ianmiles makes in political language. Strip away the theatre, and the structural claim remains: followers used to be a durable asset; now they are a weaker input into an opaque allocation system. That is not merely a creator vanity issue. It is a governance issue. A platform that invites people to build audiences and then subordinates those audiences to a hidden prediction function has changed the terms of participation without fully saying so.
Algorithmic suppression without a censor
The phrase “algorithmic suppression” is often abused. It can become a lazy substitute for evidence: my post did badly, therefore I was suppressed. That is not good enough. But it is equally lazy to say suppression only exists when a human censor presses a red button. Ranking is allocation. Allocation is power. A feed that chooses not to show followed accounts can suppress them in practical effect without ever intending to censor them in motive.
The public X code supports the existence of many allocation mechanisms: age filtering, deduplication, previously seen filtering, author social graph filtering, visibility filtering, predicted engagement scoring, author diversity attenuation, out-of-network scoring and top-K selection.26 Some of these are ordinary hygiene. Some protect user experience. Some are necessary. But together they form an opaque distribution regime.
This is why creator testimony matters. It is not technical proof. It is symptomatology. @blac_ai says artists once visible in the feed now seem to disappear despite still posting. @ianmiles says the For You feed increasingly shows random accounts while people he chose to follow are buried. These claims require measurement. Yet the architecture makes them plausible: the follow relation is not the final authority; predicted engagement is.
A society of creators can tolerate moderation rules, even harsh ones, better than it can tolerate unknowable allocation. The first at least gives you something to contest. The second turns visibility into weather.
What the code proves, what it does not
The code directly supports several claims. X’s new public For You architecture combines in-network Thunder content with out-of-network Phoenix Retrieval. Phoenix predicts multiple user actions. WeightedScorer combines Phoenix scores into a weighted ranking score. The concrete weights are hidden in excluded params. AuthorDiversityScorer applies an exponential multiplier to repeated authors within a single feed response. Thunder’s PostStore has a 48-hour default retention. Retweets are deduplicated. Previously seen posts are filtered using Bloom-filter-based logic. The system is prediction-first in a meaningful technical sense.
The code partly supports other claims. Followers are devalued in the final allocation logic insofar as follower count is not visibly a direct scoring multiplier, but the follow relation remains central to Thunder in-network retrieval. Repeated posting can cause intra-author competition within a feed response, but the public code does not prove a permanent account throttle. Older posts leave Thunder’s recent in-network store after the retention window, but the public code does not prove global invisibility across the platform.
The code does not support several public overstatements. It does not prove the exact 90 percent repost penalty. It does not prove a visible dormancy penalty for posting too little. It does not prove that every creator with declining engagement is trapped in an unavoidable algorithmic death spiral. It does not prove intentional harm to artists. It does not prove that all posts from three days ago are impossible to serve anywhere on X.
These distinctions weaken the viral rhetoric. They strengthen the serious critique.
The serious critique is not that X secretly inserted a villain switch labelled “destroy artists”. The serious critique is that the architecture has moved the centre of legitimacy from explicit social choice to inferred behavioural probability. That is more important than intent. An allocation system can be damaging without being malicious. The bureaucracy of prediction does not need hatred. It only needs metrics.
What a real correction would look like
The most straightforward correction is not a public-relations thread about how recommendation systems are complicated. Creators already know they are complicated. That is the problem. The correction is institutional separation.
For You should be discovery. Following should be subscription. The two should not be blended into a fog where explicit user choice is repeatedly overridden by opaque prediction. A real Following feed should respect the deliberate act of following as a first-class signal, not as an optional candidate source that must survive the same prediction contest as random global content. Discovery can remain model-driven. Subscription should be relationship-driven.
X should also provide creator-side observability. A creator does not need full model weights. That would be unrealistic. But they should be able to see whether impressions came from followers, non-followers, profile visits, search, conversation contexts, reposts or For You discovery. They should be able to distinguish deduplication from suppression, age filtering from low prediction, negative feedback from lack of distribution. Without that, the platform manufactures superstition. Creators start changing posting times, formats, topics and frequencies without knowing which mechanism they are fighting.
The company should publicly document ranking-affecting mechanisms at a higher level of precision: the difference between follower count and follow relation, the role of author diversity, the lifetime of in-network candidate pools, the meaning of repost deduplication, and the separation between visibility filtering and engagement ranking. Hidden constants are one thing. Hidden civic rules are another.
Most importantly, X should admit the trade-off. A prediction-first feed may improve short-term consumption. It may produce more novelty, more scrolling, more interaction and more behavioural capture. But the cost is the erosion of earned audience. When creators say the platform no longer feels social, they are not merely whining about reach. They are describing the conversion of a relationship graph into an allocation market governed by a model.
The code does not prove a conspiracy against creators. It shows something colder and more consequential: an architecture in which explicit social commitments are demoted beneath predicted behavioural yield. That is enough to explain why creators feel dispossessed. The system may be technically elegant and product-rational from a short-session consumer perspective, but it risks converting a social network into an opaque allocation machine.
A platform that still advertises the follow button should make following matter. A platform that depends on creators should let them understand how their work is distributed. A platform that calls itself a town square should not make attendance conditional on whatever the prediction engine thinks will extract the next gesture. Discovery needs a place. Prediction needs limits. The follow graph needs restoration, not as nostalgia, but as institutional memory.
Otherwise X will have built a brilliant machine for guessing what people might engage with, and in the process forgotten why they came there in the first place.
Support Note
This work does not arise in a vacuum. It costs time, money, and infrastructure, and it has long been carried out under the conditions of a documented legal dispute that I have made public. Anyone who wishes to help ensure that my journalistic writings, my technical projects, and this form of independent work continue can share my donation page or support me directly.
Ways to support:
https://www.gofundme.com/f/rechtsverteidigung-existenzsicherung-arbeitsgericht-lisbon
Or directly here:
A thousand thanks to all donors.
Sources
xai-org/x-algorithm, top-level README.md, sections “Components”, “Home Mixer”, “Thunder”, “Phoenix” and “How It Works”.
The README identifies the repository as the algorithm powering the For You feed on X, describes Home Mixer as the orchestration layer, Thunder as the in-network source for recent posts from followed accounts, Phoenix as retrieval and ranking infrastructure, and the pipeline stages from query hydration through sourcing, filtering, scoring and selection.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/README.md ↩︎
xai-org/x-algorithm, top-level README.md, “Thunder”.
Thunder is described as an in-memory post store and realtime ingestion pipeline that serves in-network candidates from accounts the requesting user follows and trims posts older than the retention period.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/README.md ↩︎
xai-org/x-algorithm, top-level README.md, “Pipeline Stages”.
Query hydration is described as fetching recent engagement history and metadata, including the following list.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/README.md ↩︎
xai-org/x-algorithm, home-mixer/candidate_pipeline/candidate.rs, PostCandidate and PhoenixScores.
The candidate structure contains author and candidate metadata, while the scoring structure defines the Phoenix score fields later consumed by WeightedScorer.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/candidate_pipeline/candidate.rs ↩︎
xai-org/x-algorithm, phoenix/README.md, “Retrieval: Two-Tower Model”.
The README describes user-tower encoding, candidate-tower encoding and dot-product similarity search for retrieval.
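At its core, the retrieval step described here reduces to a nearest-neighbour search over embeddings: encode the user once, encode candidates, and rank by dot product. The sketch below illustrates only that mechanism; the dimensions, values and function names are invented for illustration, and the production system uses learned towers and approximate similarity search at scale rather than this brute-force scan.

```rust
// Illustrative sketch of two-tower retrieval: a user embedding is compared
// against candidate embeddings by dot product, and the top-k candidates are
// returned. Names and values are assumptions, not repository code.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Returns candidate indices sorted by descending dot-product similarity.
fn retrieve(user: &[f64], candidates: &[Vec<f64>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f64)> = candidates
        .iter()
        .enumerate()
        .map(|(i, c)| (i, dot(user, c)))
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

fn main() {
    let user = vec![1.0, 0.0];
    let candidates = vec![vec![0.0, 1.0], vec![1.0, 0.0], vec![0.5, 0.5]];
    // → indices [1, 2]: the candidates most aligned with the user vector.
    println!("{:?}", retrieve(&user, &candidates, 2));
}
```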
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/phoenix/README.md ↩︎
xai-org/x-algorithm, phoenix/README.md, introductory note, “Ranking: Transformer with Candidate Isolation” and “Attention Mask: Candidate Isolation”.
The README states that the sample transformer implementation is adapted from the Grok-1 open-source release and is representative of the internally used model except for specific scaling optimisations. It also explains that candidates can attend to user and history context but not to each other.
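The candidate-isolation property can be made concrete as an attention mask. The sketch below is an illustrative simplification under assumed conventions (context tokens first, then candidate tokens; no causal masking within context); it is not the repository's implementation.

```rust
// Illustrative candidate-isolation mask: any token may attend to the
// user/history context, but one candidate token may never attend to a
// different candidate token. Token layout is an assumption for this sketch.
fn candidate_isolation_mask(n_context: usize, n_candidates: usize) -> Vec<Vec<bool>> {
    let n = n_context + n_candidates;
    let mut mask = vec![vec![false; n]; n];
    for q in 0..n {
        for k in 0..n {
            // Allowed: attending to context tokens, or to oneself.
            // Forbidden: one candidate attending to another candidate.
            mask[q][k] = k < n_context || q == k;
        }
    }
    mask
}

fn main() {
    // 2 context tokens, 3 candidate tokens.
    for row in candidate_isolation_mask(2, 3) {
        println!("{:?}", row);
    }
}
```

The practical effect is that each candidate is scored against the user context independently, so adding or removing one candidate does not change another candidate's score.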
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/phoenix/README.md ↩︎
xai-org/x-algorithm, top-level README.md, “Scoring and Ranking”.
The README lists predicted action probabilities including favourite, reply, repost, quote, click, profile click, video view, photo expand, share, dwell, follow author, not interested, block author, mute author and report.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/README.md ↩︎
xai-org/x-algorithm, home-mixer/candidate_pipeline/candidate.rs, PhoenixScores.
The code defines score fields used downstream, including favourite, reply, retweet, photo expand, click, profile click, VQV, share, share via DM, share via copy link, dwell, quote, quoted click, follow author, not interested, block author, mute author, report and continuous dwell time.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/candidate_pipeline/candidate.rs ↩︎
xai-org/x-algorithm, home-mixer/scorers/weighted_scorer.rs, WeightedScorer::compute_weighted_score.
The scorer applies weights from params to Phoenix score fields and computes a combined score.
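The weighted-sum pattern this footnote describes can be sketched as follows. The field list is abbreviated and, crucially, the weight values below are placeholders: the real weights live in the excluded `params` module and are not publicly visible.

```rust
// Sketch of the WeightedScorer pattern: each Phoenix prediction head is
// multiplied by a weight and the products are summed. Abbreviated fields;
// the weight values are invented placeholders, NOT production values.
struct PhoenixScores {
    favorite: f64,
    reply: f64,
    retweet: f64,
    report: f64, // negative signals enter the same sum, typically with negative weight
}

struct Weights {
    favorite: f64,
    reply: f64,
    retweet: f64,
    report: f64,
}

fn compute_weighted_score(s: &PhoenixScores, w: &Weights) -> f64 {
    s.favorite * w.favorite + s.reply * w.reply + s.retweet * w.retweet + s.report * w.report
}

fn main() {
    let scores = PhoenixScores { favorite: 0.30, reply: 0.10, retweet: 0.05, report: 0.01 };
    // Placeholder weights for illustration only.
    let weights = Weights { favorite: 1.0, reply: 2.0, retweet: 1.5, report: -50.0 };
    println!("combined = {}", compute_weighted_score(&scores, &weights));
}
```

The structure alone shows why one can say "these signals are weighted and combined" without being able to say how strongly any one of them counts.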
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/scorers/weighted_scorer.rs ↩︎
xai-org/x-algorithm, home-mixer/lib.rs.
The module declarations mark clients, params and util as excluded from the open-source release for security reasons.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/lib.rs ↩︎
xai-org/x-algorithm, home-mixer/scorers/weighted_scorer.rs, WeightedScorer::compute_weighted_score; and home-mixer/lib.rs. WeightedScorer calls normalize_score, while util is marked as excluded from the open-source release.
Sources: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/scorers/weighted_scorer.rs and https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/lib.rs ↩︎
xai-org/x-algorithm, home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs, construction of the scorer chain.
The pipeline instantiates PhoenixScorer, WeightedScorer, AuthorDiversityScorer and OONScorer in sequence.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs ↩︎
xai-org/x-algorithm, top-level README.md, “Scoring and Ranking”.
The README states that the weighted scorer combines predicted action probabilities into a final score using a weighted sum.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/README.md ↩︎
xai-org/x-algorithm, top-level README.md, “Candidate Sourcing”.
Thunder is described as supplying recent posts from followed accounts, while Phoenix Retrieval supplies ML-discovered out-of-network posts from the global corpus.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/README.md ↩︎
xai-org/x-algorithm, home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs, PhoenixCandidatePipeline::build_with_clients.
The pipeline wires query hydrators, Phoenix and Thunder sources, hydrators, filters, scorers, selector, post-selection hydrators, post-selection filters and side effects.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs ↩︎
xai-org/x-algorithm, home-mixer/scorers/author_diversity_scorer.rs, AuthorDiversityScorer::multiplier.
The multiplier is computed as(1.0 - floor) * decay_factor.powf(position as f64) + floor, giving an exponential attenuation for repeated authors.
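The formula can be sketched directly. Note that the actual AUTHOR_DIVERSITY_DECAY and AUTHOR_DIVERSITY_FLOOR constants come from the excluded `params` module, so the values below are illustrative assumptions, not the production values.

```rust
// Sketch of the AuthorDiversityScorer multiplier. The constants are
// placeholders: the real values are imported from the excluded `params`
// module and are not publicly visible.
const AUTHOR_DIVERSITY_DECAY: f64 = 0.5;  // assumption, not the production value
const AUTHOR_DIVERSITY_FLOOR: f64 = 0.25; // assumption, not the production value

/// Multiplier for the nth candidate (0-indexed) from the same author:
/// the first post is untouched, later ones decay toward the floor.
fn author_diversity_multiplier(position: usize) -> f64 {
    (1.0 - AUTHOR_DIVERSITY_FLOOR) * AUTHOR_DIVERSITY_DECAY.powf(position as f64)
        + AUTHOR_DIVERSITY_FLOOR
}

fn main() {
    for position in 0..4 {
        println!("post {}: x{:.3}", position, author_diversity_multiplier(position));
    }
}
```

Two structural properties follow regardless of the hidden constants: the first post from an author is never penalised (the multiplier is 1.0 at position zero), and repeated authors are attenuated but never below the floor, so additional posts are demoted rather than eliminated.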
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/scorers/author_diversity_scorer.rs ↩︎
xai-org/x-algorithm, home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs, scorer list in PhoenixCandidatePipeline::build_with_clients. AuthorDiversityScorer is placed after WeightedScorer and before OONScorer.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs ↩︎
xai-org/x-algorithm, home-mixer/scorers/author_diversity_scorer.rs; and home-mixer/lib.rs. AUTHOR_DIVERSITY_DECAY and AUTHOR_DIVERSITY_FLOOR are imported from params, while params is excluded from the open-source release.
Sources: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/scorers/author_diversity_scorer.rs and https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/lib.rs ↩︎
xai-org/x-algorithm, home-mixer/filters/retweet_deduplication_filter.rs, RetweetDeduplicationFilter.
The filter keeps the first occurrence of a tweet, whether original or retweet, and removes later duplicate retweet occurrences.
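The described behaviour is deduplication, not a scoring penalty, and the distinction matters for the reach debate. A minimal sketch, with field and type names invented for illustration rather than taken from the repository:

```rust
use std::collections::HashSet;

// Sketch of the behaviour described for RetweetDeduplicationFilter: the
// first occurrence of an underlying tweet (original or retweet) is kept,
// and later retweet occurrences of the same tweet are dropped.
// Struct and field names are illustrative assumptions.
struct Candidate {
    tweet_id: u64,
    // For retweets, the id of the underlying original tweet.
    retweeted_tweet_id: Option<u64>,
}

fn dedup_retweets(candidates: Vec<Candidate>) -> Vec<Candidate> {
    let mut seen: HashSet<u64> = HashSet::new();
    candidates
        .into_iter()
        .filter(|c| {
            let underlying = c.retweeted_tweet_id.unwrap_or(c.tweet_id);
            if c.retweeted_tweet_id.is_some() && seen.contains(&underlying) {
                return false; // later duplicate retweet: removed
            }
            seen.insert(underlying);
            true // first occurrence: kept
        })
        .collect()
}

fn main() {
    let feed = vec![
        Candidate { tweet_id: 1, retweeted_tweet_id: None },    // original: kept
        Candidate { tweet_id: 2, retweeted_tweet_id: Some(1) }, // duplicate retweet: dropped
        Candidate { tweet_id: 3, retweeted_tweet_id: Some(9) }, // first occurrence via retweet: kept
        Candidate { tweet_id: 4, retweeted_tweet_id: Some(9) }, // duplicate retweet: dropped
    ];
    for c in dedup_retweets(feed) {
        println!("kept tweet {}", c.tweet_id);
    }
}
```

A retweet that is the viewer's first exposure to a tweet survives the filter; only redundant copies within the same candidate list are removed.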
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/filters/retweet_deduplication_filter.rs ↩︎
xai-org/x-algorithm, home-mixer/filters/previously_seen_posts_filter.rs, PreviouslySeenPostsFilter.
The filter uses Bloom filters and directly provided seen IDs to remove previously seen or related posts.
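A Bloom filter is a compact probabilistic set: it can report false positives (claiming a post was seen when it was not) but never false negatives. The sketch below shows the mechanism only; the bit-array size, hash count and hashing scheme are illustrative assumptions, not the repository's configuration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Minimal Bloom-filter sketch of the "seen posts" mechanism: membership
// queries may return false positives, never false negatives. Parameters
// are illustrative, not the repository's configuration.
struct BloomFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl BloomFilter {
    fn new(num_bits: usize, num_hashes: u64) -> Self {
        BloomFilter { bits: vec![false; num_bits], num_hashes }
    }

    // Derive one bit index per (seed, item) pair.
    fn index(&self, post_id: u64, seed: u64) -> usize {
        let mut h = DefaultHasher::new();
        seed.hash(&mut h);
        post_id.hash(&mut h);
        (h.finish() as usize) % self.bits.len()
    }

    fn insert(&mut self, post_id: u64) {
        for seed in 0..self.num_hashes {
            let i = self.index(post_id, seed);
            self.bits[i] = true;
        }
    }

    fn probably_seen(&self, post_id: u64) -> bool {
        (0..self.num_hashes).all(|seed| self.bits[self.index(post_id, seed)])
    }
}

fn main() {
    let mut bf = BloomFilter::new(1024, 3);
    bf.insert(42);
    println!("42 probably seen: {}", bf.probably_seen(42));
}
```

The false-positive property cuts in one direction for creators: a Bloom filter can occasionally suppress an unseen post as "already seen", but it can never resurface a post the structure has recorded.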
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/filters/previously_seen_posts_filter.rs ↩︎
xai-org/x-algorithm, home-mixer/filters/previously_served_posts_filter.rs, PreviouslyServedPostsFilter.
The filter removes posts already served in the current request/session context, depending on request-side data.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/filters/previously_served_posts_filter.rs ↩︎
xai-org/x-algorithm, thunder/posts/post_store.rs, PostStore, insert_posts, trim_old_posts and Default.
The code defines retention_seconds, filters inserted posts by retention age, trims old posts and sets the default retention to 2 * 24 * 60 * 60.
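The retention behaviour can be sketched in a few lines. The structure and field names below are simplified assumptions; only the default of 2 * 24 * 60 * 60 seconds (48 hours) is taken from the source.

```rust
// Sketch of the Thunder retention behaviour: posts older than
// `retention_seconds` are rejected on insert and trimmed afterwards.
// Structure and names are simplified, not the repository's definitions.
struct PostStore {
    retention_seconds: u64,
    posts: Vec<(u64 /* post id */, u64 /* created_at, unix secs */)>,
}

impl PostStore {
    fn new() -> Self {
        // Matches the documented default: 2 * 24 * 60 * 60 seconds = 48 hours.
        PostStore { retention_seconds: 2 * 24 * 60 * 60, posts: Vec::new() }
    }

    fn insert_post(&mut self, id: u64, created_at: u64, now: u64) {
        if now.saturating_sub(created_at) < self.retention_seconds {
            self.posts.push((id, created_at));
        } // posts already past retention are dropped on insert
    }

    fn trim_old_posts(&mut self, now: u64) {
        let cutoff = now.saturating_sub(self.retention_seconds);
        self.posts.retain(|&(_, created_at)| created_at >= cutoff);
    }
}

fn main() {
    let mut store = PostStore::new();
    let now = 1_000_000;
    store.insert_post(1, now - 10, now);      // fresh: kept
    store.insert_post(2, now - 200_000, now); // older than 48h: rejected
    println!("after insert: {} post(s)", store.posts.len());
    store.trim_old_posts(now + 200_000);      // post 1 ages out later
    println!("after trim: {} post(s)", store.posts.len());
}
```

The important caveat from the surrounding analysis holds here too: this bounds the lifetime of the in-network candidate store, not the global lifetime of a post across search, profiles or conversations.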
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/thunder/posts/post_store.rs ↩︎
xai-org/x-algorithm, thunder/posts/post_store.rs, PostStore::start_auto_trim.
The function accepts interval_minutes as a parameter, so the code establishes the existence of periodic trimming but not, by itself, a fixed public two-minute production interval.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/thunder/posts/post_store.rs ↩︎
xai-org/x-algorithm, home-mixer/filters/age_filter.rs, AgeFilter.
The filter removes posts older than a configured max_age using tweet creation time derived from the tweet ID.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/filters/age_filter.rs ↩︎
xai-org/x-algorithm, home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs, filter list.
The pipeline applies AgeFilter::new(Duration::from_secs(params::MAX_POST_AGE)), which means the threshold comes from the excluded params module.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs ↩︎
xai-org/x-algorithm, top-level README.md, “Filtering”.
The README lists AgeFilter, SelfpostFilter, RepostDeduplicationFilter, PreviouslySeenPostsFilter, PreviouslyServedPostsFilter, AuthorSocialgraphFilter, VFFilter and conversation deduplication as filtering stages.
Source: https://raw.githubusercontent.com/xai-org/x-algorithm/main/README.md ↩︎
Deep Research Prompt: Code Analysis
TASK
Conduct a deep, source-grounded analysis of X's published source code for recommending and serving user posts.
GOAL
I want to understand
1) how the technical pipeline for serving posts is structured,
2) which components are responsible for candidate retrieval, ranking, filtering, mixing and serving,
3) which concrete metrics, features, scores, heuristics and optimisation objectives the code uses to serve posts to users,
4) which factors influence reach, visibility, prioritisation, downranking or exclusion,
5) which points are genuinely evidenced by the published code and which can only be inferred indirectly.
WORKING MODE
Work strictly close to the sources, in this order:
1. Build a source inventory
- First identify and open X's official public sources:
- the main repository `twitter/the-algorithm`
- the ML repository `twitter/the-algorithm-ml`
- the official engineering post on the recommendation architecture
- further official documentation files within the repositories, in particular README files, architecture notes, signal/feature documents and relevant subdirectories.
- Do not use third-party analyses, blog posts, YouTube videos or social-media threads as a primary basis.
- External secondary sources are admissible only where they help situate technical terms or papers referenced in the code, and they must be clearly marked as secondary sources.
2. Reconstruct the system architecture
- Reconstruct the end-to-end pipeline as precisely as possible:
- Candidate Sources
- In-Network Retrieval
- Out-of-Network Retrieval
- graph-based retrieval paths
- embedding-based retrieval paths
- Light Ranking
- Heavy Ranking
- heuristics and business rules
- Visibility Filtering / Trust & Safety / Downranking
- mixing with ads, follow recommendations or other insertions
- serving to the timeline or other product surfaces
- Map each stage to the responsible services, models, data sources and feature generators.
- Do not describe only in the abstract; name concrete module names, paths, classes, configuration files, feature definitions and scoring sites.
3. Extract metrics and ranking signals
- Systematically determine which signals the published code uses for ranking and serving.
- In particular, look for:
- engagement proxies
- like, reply, repost, click, dwell, profile-visit, follow, negative-feedback and hide signals
- graph signals
- similarity scores
- Real Graph scores
- reputation scores
- social-proof signals
- author-diversity rules
- content-balance rules
- fatigue, freshness and recency signals
- visibility and safety filters
- NSFW-, abuse-, block-, mute- or quality-related rules
- features at the author, viewer, tweet, edge and conversation level
- multi-task objectives
- logistic-regression, neural-ranking or embedding-based scoring functions
- Extract identifiers from the code as concretely as possible:
- feature names
- metric names
- score names
- enum values
- weights
- thresholds
- configuration parameters
- ranking pipelines
- model outputs
- Where weights or score combinations are visible, present them explicitly.
- Where no weights are disclosed, mark that clearly.
4. Separate evidenced facts from inference
- Throughout the analysis, distinguish strictly between:
A. statements directly evidenced in the code or official documentation,
B. inferences strongly suggested by structure and naming,
C. open points that cannot be answered with confidence because of missing training data, missing weights or unpublished internal components.
- Present nothing as established that the published code does not genuinely support.
5. Reverse-engineer the serving logic
- Using the code, answer as concretely as possible:
- What increases the probability that a post appears in the “For You” feed?
- What role do accounts the user follows play?
- What role do interactions by accounts the user follows play?
- What role do users with similar interests play?
- What role do embeddings and community assignments play?
- Which factors lead to downranking or exclusion?
- How are replies, threads, edited posts, social proof and negative feedback signals handled?
- Are there identifiable mechanisms for quality control or reach limitation?
- Is there evidence of fairness-, diversity-, safety- or revenue-driven interventions?
6. Provide code-level evidence
- For every substantive statement, give concrete evidence:
- repository
- path
- file
- class / function / object / configuration
- where possible, the central code lines or defining blocks
- If exact line numbers cannot be provided reliably, at least the path and the relevant function or definition must be named.
7. Present the results
- Structure the answer as follows:
PART I – Executive Summary
- A concise summary in precise German:
- How does X serve posts?
- What are the most important ranking metrics?
- What appears to be especially influential?
- What remains open?
PART II – Architecture Overview
- the pipeline from retrieval to serving
- the main components and their responsibilities
PART III – Ranking Metrics and Signals
- a table with the columns:
- category
- signal / metric / score
- technical source in the code
- presumed function
- evidence status: directly evidenced / strong inference / open
PART IV – Heuristics, Filters and Interventions
- everything that is not classic model ranking:
- Visibility Filtering
- Downranking
- Diversity
- Fatigue
- Social Proof
- Safety
- policy- or revenue-adjacent interventions
PART V – What Is Likely to Drive Reach in Practice
- a sober synthesis:
- Which metrics and features appear especially relevant to reach and distribution?
- Which kinds of user behaviour are apparently rewarded?
- Which kinds of content or accounts tend to be held back?
PART VI – Limits of the Analysis
- Which parts cannot be reconstructed with confidence because of unpublished training data, model weights or internal services?
PART VII – Appendix
- an ordered list of the most important files, paths, services and models
- an optional glossary section on Real Graph, GraphJet, TwHIN, Heavy Ranker, Home Mixer, etc.
SPECIAL REQUIREMENTS
- Work in a technical-analytical register, not a journalistic-essayistic one.
- No moral or political presuppositions.
- No “the algorithm explained in simple words” popularisation; deliver a code-level reconstruction.
- Wherever possible, reason from concrete code to function rather than from generic recommendation logic to X.
- If the published code appears old, incomplete or not identical to the production version, this must be stated explicitly and reflected in the conclusions.
- Where configuration names, feature names or model outputs seem self-explanatory, treat that as an indication, not as full proof.
- At the end, additionally provide a short section:
“Practical implications for users who want to understand organic reach”
– but only on the basis of what is genuinely derivable from the code.
CITATION STANDARD
- Place evidence close to the claim.
- Primary sources before secondary sources.
- No assertion without a traceable source location.
- Distinguish cleanly between observation, model, interpretation and value judgment.
Additionally produce a machine-readable artifact, such as a CSV or Markdown table with feature_name | component_path | signal_type | evidence_status | comment.
Deep Research Prompt: Code Analysis – ChatGPT 5.5 Pro Code Analysis Report
2026.04.24_ChatGPT_Agent_Analysis_X_Repository
Deep Research Prompt
TASK
Conduct a deep research investigation, with full methodological rigour, into the publicly released repository `xai-org/x-algorithm`.
The goal is a source-grounded examination of contested claims from a viral creator thread about reach loss on X, in particular concerning Phoenix, Thunder, follower count, the prediction trap, author diversity, repost/retweet deduplication and 48h retention.
This is not a general explanation of the X algorithm. It is a claim-by-claim examination: what is directly evidenced by the published code, what is only partially evidenced, what is interpretation, and what is unprovable or overstated?
PRIORITY
- Work exclusively with primary sources from `https://github.com/xai-org/x-algorithm`.
- Prefer raw files from `raw.githubusercontent.com`.
- Use README files, Rust/Python files, configuration files and directory structures.
- Third-party sources, blogs, X threads, YouTube videos, Medium posts, Hacker News comments and media reports must not be used as evidence.
- The viral creator thread is only a catalogue of claims, not a source of fact.
- Every substantive statement must be evidenced with repo path, file, symbol/function and line reference.
- Distinguish strictly between:
1. direct code findings,
2. documentation findings,
3. strong inference,
4. weak inference,
5. open points due to unpublished parameters or production configuration,
6. unsupported or refuted claims.
TECHNICAL WORKING MODE
Work robustly against GitHub UI problems:
1. Open first:
- `https://github.com/xai-org/x-algorithm`
- `https://github.com/xai-org/x-algorithm/tree/main/home-mixer`
- `https://github.com/xai-org/x-algorithm/tree/main/phoenix`
- `https://github.com/xai-org/x-algorithm/tree/main/thunder`
- `https://github.com/xai-org/x-algorithm/tree/main/candidate-pipeline`
2. For specific files, prefer raw URLs where possible, e.g.:
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/README.md`
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/phoenix/README.md`
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/lib.rs`
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs`
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/scorers/weighted_scorer.rs`
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/scorers/author_diversity_scorer.rs`
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/filters/age_filter.rs`
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/filters/retweet_deduplication_filter.rs`
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/filters/previously_seen_posts_filter.rs`
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/home-mixer/filters/previously_served_posts_filter.rs`
- `https://raw.githubusercontent.com/xai-org/x-algorithm/main/thunder/posts/post_store.rs`
3. If files are missing or GitHub rendering hangs:
- document the access failure,
- use alternative raw paths,
- check directory listings via the GitHub web interface,
- work in small groups of files rather than repo-wide at once.
4. Establish the repository status:
- owner/organisation,
- branch,
- number of commits,
- last commit,
- release status,
- licence,
- README disclaimers,
- indications of sample code, representative code, excluded modules or unpublished production parameters.
CORE QUESTIONS
I. Repository status and the limits of the open code
Examine:
- Does the repository contain complete production code or representative sample code?
- Which modules are explicitly excluded from the open-source release?
- Is there evidence of missing clients, parameters, runtime configurations, model weights, training data or internal services?
- What can be said about parity with production?
In particular, examine:
- `home-mixer/lib.rs`
- `phoenix/README.md`
- the top-level `README.md`
II. Follower count
Examine the claim:
“Follower count does nothing for reach anymore. It is only display data and not fed into any scorer.”
Search for:
- `followers_count`
- `follower_count`
- `author_followers_count`
- `social_proof`
- `following_list`
- `viewer_follows_author`
- `author_is_followed`
- `author_reputation`
- `author_popularity`
- `gizmoduck`
- `socialgraph`
- `author_id`
- `in_network`
To examine:
1. Is there a numeric follower-count feature?
2. Is this feature used in scorers?
3. Is it used only in candidate hydrators or author metadata?
4. What role does the following list play?
5. What role does Thunder play as the in-network source?
6. Are there social-graph filters or follow-relationship logic?
7. Does the absence of a direct follower count in the score mean that followers are irrelevant to reach?
Separate explicitly:
- follower count as a numeric feature,
- the following list as a candidate-retrieval source,
- the follow relationship as a graph or filter signal,
- author metadata used for display,
- historical interactions by followers as indirect reach signals.
III. Phoenix prediction heads
Examine the claim:
“Phoenix uses 19 prediction heads and predicts reach before anyone sees the post.”
To examine:
- Which prediction heads does the top-level README name?
- Which prediction heads does `phoenix/README.md` name?
- Which score fields exist in the code?
- Are there exactly 15, 19 or some other number?
- Which fields exist only in the `WeightedScorer` but not in the README list?
Search specifically for:
- `favorite_score`
- `reply_score`
- `retweet_score`
- `repost_score`
- `quote_score`
- `click_score`
- `profile_click_score`
- `video_view_score`
- `vqv_score`
- `photo_expand_score`
- `share_score`
- `share_via_dm_score`
- `share_via_copy_link_score`
- `quoted_click_score`
- `dwell_score`
- `dwell_time`
- `follow_author_score`
- `not_interested_score`
- `block_author_score`
- `mute_author_score`
- `report_score`
Assess:
- Which are directly evidenced?
- Which are combined in the final score?
- Which exist only as README documentation?
- Is the “19 heads” claim evidenced?
- Is the “prediction trap” evidenced as a code finding, or only as a system-theoretic interpretation?
IV. Weighted Scorer
Open and analyse:
- `home-mixer/scorers/weighted_scorer.rs`
- `home-mixer/lib.rs`
- `home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs`
Examine:
1. Exactly which scores are combined?
2. Does the code use a weighted sum?
3. Where do the weights come from?
4. Is `params` public?
5. Is `params` excluded from the open-source release?
6. Are concrete weight values publicly visible?
7. Which negative signals enter the score?
8. Which positive signals enter the score?
9. What role does `normalize_score` play?
10. Is `util` public, or likewise excluded?
Assess:
- May one publicly claim that replies count for more than likes?
- May one publicly claim concrete weight values?
- Or may one only say: “These signals are combined with weights, but the concrete weights are not visible in the open release”?
V. Author Diversity
Open and analyse:
- `home-mixer/scorers/author_diversity_scorer.rs`
- `home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs`
- `home-mixer/lib.rs`
Examine:
1. Does `AuthorDiversityScorer` exist?
2. Is it applied in the pipeline after `WeightedScorer`?
3. Does it use exponential decay?
4. What is the formula?
5. What does `position` refer to?
6. Does `author_counts` count within a single candidate list?
7. Does the decay apply per feed response, per request, per session, or over longer periods?
8. Where do `AUTHOR_DIVERSITY_DECAY` and `AUTHOR_DIVERSITY_FLOOR` come from?
9. Are concrete parameter values visible?
10. Is there any dormancy penalty for posting too rarely?
Assess the thread claims:
- “Posting too much hurts.”
- “Each additional post from you in the same session scores lower.”
- “Posting too little hurts because the account looks dormant.”
- “The safe posting band is narrow and undisclosed.”
VI. Retention / Age Filter / 48h
Open and analyse:
- `thunder/posts/post_store.rs`
- `home-mixer/filters/age_filter.rs`
- `home-mixer/candidate_pipeline/phoenix_candidate_pipeline.rs`
- the top-level `README.md`
Search for:
- `retention_seconds`
- `2 * 24 * 60 * 60`
- `trim_old_posts`
- `start_auto_trim`
- `interval_minutes`
- `MAX_POST_AGE`
- `AgeFilter`
- `Thunder`
- `Phoenix Retrieval`
Examine:
1. Is the 48h default in Thunder evidenced?
2. Where exactly is this default set?
3. Are older posts discarded on insert?
4. Does `trim_old_posts` remove old posts?
5. Does `start_auto_trim(interval_minutes)` exist?
6. Is a concrete two-minute interval evidenced?
7. Is `AgeFilter` applied in the Phoenix candidate pipeline?
8. Does `MAX_POST_AGE` come from `params`?
9. Is `params` public, or excluded?
10. Does the 48h apply only to Thunder, or also to Phoenix?
11. Does the code evidence “globally gone from the system”?
12. Are there separate paths for older posts via search, profiles, conversations, quotes, external links or other product surfaces?
Assess the thread claims:
- “Posts are gone from the system after 48 hours.”
- “Thunder auto-trims every 2 minutes.”
- “Nothing from 3 days ago is shown to anyone.”
- “You start from zero every 48 hours.”
VII. Repost / Retweet Deduplication
Open and analyse:
- `home-mixer/filters/retweet_deduplication_filter.rs`
- `home-mixer/filters/previously_seen_posts_filter.rs`
- `home-mixer/filters/previously_served_posts_filter.rs`
- files involving `BloomFilter`
- the candidate structure, with fields such as `retweeted_tweet_id`, `source_post_id`, `is_retweet`, `is_quote`, where present
Examine:
1. What exactly does `RetweetDeduplicationFilter` deduplicate?
2. Is a repost/retweet removed, or downranked?
3. Does it distinguish between original, retweet, repost and quote?
4. Is there a 90% impression deduction in the code?
5. Is there a score deduction in the code?
6. Is there Bloom-filter logic?
7. Is the Bloom filter used client-side, per request or per session?
8. Is there evidence that self-reposts are treated differently?
9. Is there evidence that reposts of other artists' work are uniformly “buried”?
10. Is there only deduplication of already-seen or related posts?
Assess the thread claims:
- “Every repost of another artist's work gets buried.”
- “Up to 90% impression deduction.”
- “Self-reposting is viable because the Bloom filter resets by session.”
- “RetweetDeduplicationFilter only drops self-reposts for followers who already saw the original in the same session.”
VIII. Creator reach
Examine what is genuinely derivable from the code:
1. Do replies help more than likes?
2. Do photo expand and dwell help visual posts?
3. Does negative feedback depress the score?
4. Do multiple posts by the same author compete with one another within a candidate list?
5. Does the follower relationship help only in candidate retrieval, or also in the final score?
6. Is a permanent downward spiral evidenced in the code?
7. Is ragebait structurally favoured, or merely plausible via reply/dwell/engagement predictions?
8. Are artists or image posts systematically disadvantaged?
9. Can “content quality is secondary” be evidenced from the code?
10. Which statements remain undecidable without experiments?
OUTPUT FORMAT
Begin with a short preliminary note:
- the repository analysed,
- the date of the research,
- branch/commit where visible,
- a note on the limits of the open-source release.
Then produce a claim-by-claim examination.
Use the following table, but keep entries terse:
`claim_id | claim_text | repo_path | symbol | code_evidence | status | confidence | interpretation | public_claim_safe?`
Status values only from this list:
- directly evidenced
- partially evidenced
- strong inference
- weak inference
- not evidenced
- refuted
- open due to excluded params/runtime config
After the table, each claim group receives a short prose assessment covering:
- observation,
- model,
- interpretation,
- value judgment.
No long sentences in tables. Longer explanations belong in the running text.
FINAL ASSESSMENT
Close with five sections:
1. Which thread claims hold up?
2. Which thread claims are overstated?
3. Which thread claims are unprovable?
4. Which statements should not be publicly asserted as fact?
5. Which experiments would be needed to test real reach effects?
EXPERIMENT PROPOSALS
Propose code-independent tests, kept clearly separate from the code analysis:
- controlled posting frequency across several accounts,
- image posts vs text posts,
- replies vs likes vs reposts,
- self-reposts vs reposts of others,
- 24h/48h/72h impression curves,
- follower-only vs out-of-network impressions,
- a negative-feedback proxy, insofar as it is ethically and practically measurable.
STRICT NEGATIVE RULES
- No third-party source as evidence.
- No speculation presented as fact.
- No political or moral presuppositions.
- No creator-advice prose without a code basis.
- No claims about production behaviour where the open code is only representative.
- No blanket statement that "the algorithm does X" when only a partial path such as Thunder, Phoenix or Home Mixer is evidenced.
- No equating of deduplication with a reach penalty.
- No equating of the 48h Thunder retention with global content lifetime.
- No equating of a missing follower-count feature with an irrelevant follower relationship.
Deep Research Prompt – ChatGPT 5.5 Pro Deep Research Report
2026.04.24_ChatGPT_Deep_Research_X_Repository
Deep Research Prompt – Gemini 3 Pro Deep Research Report
2026.04.24_Gemini_Deep_Research_X_Repository_extended
Changelog
Post-Publication Technical Addendum
After publication, I subjected this article to a stricter source-quality review against the public xai-org/x-algorithm repository and several independent code-analysis reports. The central thesis of the article remains intact: the newer public X recommendation architecture strongly suggests a shift from social-graph-first distribution toward prediction-first allocation. However, several formulations require tighter evidentiary boundaries.
This addendum clarifies those boundaries. It should be read as a technical qualification of the original essay, not as a withdrawal of its central argument.
- The public repository must not be treated as the total production reality of X. It exposes important architectural structure: candidate sourcing, hydration, filters, scorers, retention mechanisms and the broad form of the Phoenix-based ranking pipeline. It does not expose all runtime parameters, production flags, exact model weights, training data, A/B tests, internal service implementations or policy layers. In particular, `params`, `clients` and `util` are not fully available in the public release. This limits any claim about exact weights, exact thresholds or exact production magnitudes.
- Follower count and the follow graph must be separated. The public code does not show a simple follower-count multiplier inside the visible `WeightedScorer` path. That supports the narrower claim that numerical follower count is not visibly used as a direct final-score multiplier. It does not support the broader claim that followers or the social graph are irrelevant. The follow relationship remains structurally important because Thunder supplies recent in-network candidates from accounts the user follows. The better formulation is therefore: explicit follows appear demoted relative to predicted engagement in the For You allocation process, but they remain important for in-network candidate retrieval.
- The "Prediction Trap" should be understood as a system-level interpretation, not as a direct code statement. The public code supports the claim that Phoenix predicts multiple user-action probabilities before final feed selection and that these predictions are combined by `WeightedScorer`. This makes feedback-loop concerns structurally plausible. It does not, by itself, prove a deterministic death spiral. Such a claim would require empirical platform data about exposure, live feedback loops, exploration, retraining, score updates and production configuration.
- The public code supports the existence of multiple Phoenix score fields, including positive interaction signals and negative feedback signals. It also supports the claim that these score fields are combined through weighted scoring. It does not disclose the concrete production weights. Therefore, any claim that one specific interaction type, such as replies, is definitively weighted more heavily than another, such as likes, must remain qualified unless the relevant parameters are publicly available.
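The combination step described above can be sketched abstractly. Everything concrete in this sketch is an assumption: the action names, probabilities and weights are invented placeholders, not values from the repository. That is precisely the evidentiary gap the addendum describes, because the public code shows the structure of a weighted combination without publishing the production parameters.

```python
# Illustrative sketch of prediction-head combination, in the spirit of the
# repository's WeightedScorer. All names, probabilities and weights below
# are hypothetical; the public code exposes the shape, not the parameters.
PREDICTED_ACTIONS = {
    "like": 0.55,               # P(user likes the post) - invented
    "reply": 0.10,
    "repost": 0.08,
    "negative_feedback": 0.03,  # probability of a negative signal - invented
}

WEIGHTS = {
    "like": 1.0,                # all weights hypothetical
    "reply": 2.0,
    "repost": 1.5,
    "negative_feedback": -5.0,  # negative signals subtract from the score
}

def weighted_score(predictions: dict, weights: dict) -> float:
    """Linear combination of predicted action probabilities."""
    return sum(weights[action] * p for action, p in predictions.items())

score = weighted_score(PREDICTED_ACTIONS, WEIGHTS)
print(score)
```

The point of the sketch is the sensitivity it makes visible: changing one weight reorders the whole feed, which is why undisclosed production weights block any definitive claim about, say, replies versus likes.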
- The `AuthorDiversityScorer` finding remains one of the strongest code-supported points. The public implementation applies an exponential attenuation formula to repeated candidates by the same author within a ranked feed response or candidate batch. This supports the claim that multiple posts by the same author can compete against one another in the same allocation context. It does not prove a permanent account-level throttle. It also does not prove an explicit dormancy penalty for posting too little. Reduced reach after inactivity may be plausible in a prediction system, because user-action sequences can become weaker or stale, but that is not the same as a visible hard-coded penalty.
- Repost and retweet handling must be described as deduplication, not as a code-proven 90 percent reach penalty. The public repository supports binary filtering and seen-content logic through mechanisms such as `RetweetDeduplicationFilter` and `PreviouslySeenPostsFilter`. These mechanisms can materially reduce repeated exposure to the same underlying content, especially where audiences overlap. But the repository does not show a linear 90 percent score or impression deduction for reposting another creator's work. Any such percentage must be treated as external policy, monetisation logic, empirical observation or creator-side interpretation unless separately supported by a primary X source.
- Bloom-filter-based seen logic should not be overstated as a self-repost exemption. The public code supports probabilistic seen-content filtering. It does not establish a general rule that self-reposts are algorithmically protected while reposts of others are punished. The visible mechanism is duplicate and previously-seen-content management, not a creator-specific privilege system.
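The exponential attenuation attributed to `AuthorDiversityScorer` above can be illustrated with a minimal sketch. The decay factor and floor here are invented placeholders; the public implementation uses a formula of this general shape, but its production constants are not established by the visible code.

```python
def author_diversity_multiplier(prior_count: int, decay: float = 0.5,
                                floor: float = 0.1) -> float:
    """Attenuation for the (prior_count + 1)-th post by the same author
    in one ranked batch. `decay` and `floor` are hypothetical constants."""
    return floor + (1.0 - floor) * (decay ** prior_count)

def apply_diversity(scored_posts):
    """scored_posts: list of (author_id, base_score), already ranked.
    Each repeated appearance of an author is attenuated exponentially."""
    seen = {}
    adjusted = []
    for author, score in scored_posts:
        n = seen.get(author, 0)
        adjusted.append((author, score * author_diversity_multiplier(n)))
        seen[author] = n + 1
    return adjusted

batch = [("a", 10.0), ("a", 9.0), ("b", 8.0), ("a", 7.0)]
# The first "a" post keeps its full score; later "a" posts are attenuated,
# so they compete against each other within the same allocation context.
print(apply_diversity(batch))
```

Note what the sketch does and does not do: the attenuation resets with each batch, which is exactly why it supports within-feed competition between an author's posts but not a permanent account-level throttle.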
- The 48-hour claim must be limited to the relevant subsystem. Thunder's public `PostStore` has a two-day default retention window, and older posts are trimmed from that in-memory in-network candidate store. That supports a strong freshness bias in the recent in-network recommendation path. It does not prove that posts disappear globally from X after 48 hours. Older posts may remain accessible through profiles, search, direct URLs, conversation contexts, quote posts, external links or other product surfaces outside the specific Thunder candidate path. Likewise, periodic trimming exists, but a fixed public two-minute production cadence is not established by the visible code alone.
- Creator statements by @blac_ai and @ianmiles should be read as experiential testimony and hypothesis-generating evidence, not as technical proof. Their complaints align with several code-supported mechanisms: prediction-first ranking, author diversity attenuation, freshness pressure, deduplication and the weakened distributional promise of the follow graph. But creator-side reach collapse does not by itself prove a specific scoring weight, hidden policy, deterministic suppression loop or exact penalty.
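The retention behaviour attributed to Thunder's `PostStore` can be reduced to a simple trim loop. This is a minimal sketch, assuming an in-memory store with a two-day window as described; the class and method names are illustrative, not the repository's actual implementation.

```python
import time
from collections import deque

RETENTION_SECONDS = 48 * 3600  # two-day default, mirroring the described window

class InMemoryPostStore:
    """Illustrative in-memory store of recent in-network posts.

    Posts older than the retention window stop being candidates on this
    path; they may still exist elsewhere (profiles, search, direct URLs).
    """
    def __init__(self, retention_s: int = RETENTION_SECONDS):
        self.retention_s = retention_s
        self.posts = deque()  # (created_at, post_id), appended in arrival order

    def add(self, post_id: str, created_at: float) -> None:
        self.posts.append((created_at, post_id))

    def trim(self, now: float) -> int:
        """Drop posts older than the retention window; return count removed."""
        removed = 0
        while self.posts and now - self.posts[0][0] > self.retention_s:
            self.posts.popleft()
            removed += 1
        return removed

store = InMemoryPostStore()
now = time.time()
store.add("old", now - 60 * 3600)   # 60h old: outside the window
store.add("fresh", now - 2 * 3600)  # 2h old: still a candidate
store.trim(now)
```

The sketch makes the scope limitation concrete: trimming removes a post from one candidate store, not from the platform, which is the difference between a freshness bias on the Thunder path and a global 48-hour content death.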
These clarifications do not negate the article’s core argument. They refine it. The public code does not prove that X intentionally harms creators. It does not prove a universal 48-hour content death, a code-visible 90 percent repost penalty, an explicit dormancy penalty or a deterministic death spiral. It does show a prediction-first allocation architecture in which explicit social commitments no longer operate as a simple distributional guarantee. That is sufficient to explain why many creators experience the system as opaque, unstable and socially dispossessing.
In short: the article's central diagnosis stands, but the strongest version of that diagnosis is not "the code proves every creator complaint". It is this: the code makes those complaints structurally intelligible, while several of the most viral numerical claims require either stronger primary evidence or careful qualification.
