Materials & raw data¶
Everything behind Twilight of the Gods / Гибель богов: the raw model outputs and the analysis, so you can read the primary sources or reproduce the ranking yourself.
A ~350-line plan_node "god node" from a real LangGraph agent was handed to 11 models (5 American, 6 Chinese). Each was asked first to propose how to split the node, then to review the others' proposals. Three methods were then used to decide which models to trust.
Note
These materials are English-only and refer to the original (private) agent source by path; the agent's own code is not included. Prose & charts are licensed CC BY 4.0; the scripts are MIT.
This is an appendix to the article
These pages belong to the Twilight of the Gods post rather than the blog at large. This index is the map; every page is linked below.
The three stages¶
- Proposals: 11 models each propose how to decompose plan_node.
- Reviews: each model evaluates every proposal, across two independent runs.
- Picking whom to trust: score agreement (rank centrality), comparison by extracted theses, and a center-of-opinion (medoid) method, plus a best-analyst meta-analysis.
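To give a feel for the center-of-opinion idea, here is a minimal sketch of picking a medoid among reviewer rankings. It is not the article's ranking script: the Kendall tau distance, the function names, and the toy data are all assumptions made for illustration. The medoid is the reviewer whose ranking has the smallest total distance to every other reviewer's ranking.

```python
from itertools import combinations


def kendall_tau_distance(a, b):
    """Count the item pairs that rankings a and b put in opposite order."""
    pos_a = {item: i for i, item in enumerate(a)}
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(
        1
        for x, y in combinations(a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def medoid(rankings):
    """Return (reviewer, totals): the reviewer closest to all others, plus each total distance."""
    totals = {
        name: sum(kendall_tau_distance(r, other) for other in rankings.values())
        for name, r in rankings.items()
    }
    return min(totals, key=totals.get), totals


# Hypothetical toy data: four reviewers each rank four proposals, best first.
rankings = {
    "reviewer_a": ["p1", "p2", "p3", "p4"],
    "reviewer_b": ["p1", "p3", "p2", "p4"],
    "reviewer_c": ["p2", "p1", "p3", "p4"],
    "reviewer_d": ["p4", "p1", "p2", "p3"],
}

center, totals = medoid(rankings)
print(center, totals)
```

In this toy data the other three rankings are each a small perturbation of reviewer_a's, so reviewer_a ends up as the center. A real run would use the 11 models' actual rankings and whatever distance the published script defines.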
Start here¶
- The original god node: what plan_node actually did.
- Reproduce: run the ranking script yourself.
Browse everything¶
The original god node¶
Proposals¶
The 11 decomposition designs.
- DeepSeek-4-Pro
- Fable-5
- Gemini-3.1-Pro
- GLM-5.1
- GPT-5.4
- GPT-5.5
- Kimi-2.6
- MiMo-2.5-Pro
- Opus
- Qwen-3.6-Plus
- Qwen-3.7-Max
Reviews¶
Every model evaluating every proposal.
- DeepSeek-4-Pro
- Fable-5
- Gemini-3.1-Pro
- GLM-5.1
- GPT-5.4
- GPT-5.5
- Kimi-2.6
- MiMo-2.5-Pro
- Opus-4.7
- Qwen-3.6-Plus
- Qwen-3.7-Max
- Rankings matrix
- Consensus ranking
Best analyst¶
The meta-analysis of who reviews best.
Theses¶
Thesis-based ranking of the reviews.
- Run 1 · Index
- Run 1 · Matrix
- Run 1 · Agreement chart
- Run 1 · Common-thesis ranking
- Run 1 · Preferability ranking
- Run 1 · Summary recap
- Run 2 · Consensus ranking
- Run 3 · Consensus ranking