awesome-list · theory-first

Why do reasoning models actually work?

A theory-and-mechanism-first map of the o-series / R1 / Claude-thinking paradigm. Eight argued chapters, five reproduction notebooks, a monthly benchmarks tracker, and explicit engagement with the faithfulness and overthinking debates that most lists sidestep.

  • 8 argued chapters
  • 60+ indexed papers
  • 13 models compared
  • 5 reproduction notebooks
  • 10 benchmarks tracked
  • 7 reading paths

The arc

From CoT prompting to RL-for-reasoning in 36 months

The 2022–2026 trajectory: chain-of-thought prompting → process reward models → test-time compute scaling → o1 → DeepSeek-R1 → the overthinking and faithfulness debates. Each transition produced a recipe; each recipe shifted what "reasoning" meant. The interactive timeline lets you click any milestone.

Timeline milestones (each is a paper, a model, or a benchmark):

  • 2022: CoT
  • 2023: PRMs
  • 2024-07: AlphaProof
  • 2024-08: Snell
  • 2024-09: o1
  • 2025-01: R1
  • 2025-02: Claude 3.7
  • 2025-05: Faithfulness
  • 2025-07: Gemini gold
  • 2026-03: ARC-AGI-3

→ Open the interactive timeline

Eight chapters

Each argues a position about how reasoning works

Most awesome-lists aggregate titles. This one argues mechanisms. Each chapter has a TL;DR, the proposed mechanism, 10+ annotated papers, the live debates, reading paths, and an open-problems list.

→ Open the field map (interactive)

Why this list is different

Not an aggregator. An argument.

Typical awesome-reasoning list

  • Flat list of titles + URLs
  • Mixes prompt-engineering tricks with theoretical results
  • No engagement with debates or open problems
  • Closed-model marketing claims listed as fact
  • Static; ages poorly after each release cycle

Awesome Reasoning Models Theory

  • Chapter-as-position; each argues a mechanism
  • Five-criterion bar for entries (including primary source and mechanism-not-phenomenon)
  • Explicit "Debates" section per chapter
  • Every closed-model number flagged with (vendor-reported)
  • Monthly tracker digest + WANTED gap list
  • Five reproduction notebooks at single-GPU scale
  • Sister-list scope split, boundary cases enumerated

Field map

How the eight chapters depend on each other

Solid arrows are mechanism dependencies. Dashed are open debates. Color marks the role: foundation (blue), inference-time (green), training-time (orange), failure modes (pink), synthesis (purple).

Figure: Field map of the eight chapters and their interconnections

Browse the literature

Two interactive registries

The papers and models are also exposed as filterable indexes — slice by chapter, year, status, open vs closed weights, training recipe. Both are powered by versioned JSON files in the repo.
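As a rough illustration of slicing such a registry offline, here is a minimal sketch. The field names (`chapter`, `year`, `status`), the sample entries, and the `filter_entries` helper are all hypothetical placeholders; the actual JSON schema in the repo may differ.

```python
import json

# Hypothetical registry entries for illustration only.
# The real JSON files in the repo may use different field names.
PAPERS_JSON = """
[
  {"title": "Example paper A", "chapter": 1, "year": 2022, "status": "published"},
  {"title": "Example paper B", "chapter": 3, "year": 2024, "status": "preprint"},
  {"title": "Example paper C", "chapter": 3, "year": 2025, "status": "published"}
]
"""


def filter_entries(entries, **criteria):
    """Return the entries whose fields equal every given criterion."""
    return [
        entry
        for entry in entries
        if all(entry.get(key) == value for key, value in criteria.items())
    ]


papers = json.loads(PAPERS_JSON)

# Slice by chapter and status, mirroring the filters named in the text.
chapter3_published = filter_entries(papers, chapter=3, status="published")
print([paper["title"] for paper in chapter3_published])
```

Exact-match filtering on keyword arguments keeps the helper independent of any particular schema: a field the registry does not have simply never matches.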

Family tree

How today's reasoners trace back to their bases

The open and closed tracks both run base → SFT → RLVR → deployed reasoner, but only one side discloses the recipe. The dashed lavender arrow is R1's distillation trail — the largest single distribution-shift event the open ecosystem has seen.

Figure: Family tree of major reasoning models, 2024-2026

Reproductions you can run

Five notebooks, single-GPU runnable

Each notebook isolates one chapter's empirical claim and reproduces it at single-A10G scale (or CPU for the toys). Documented hardware, documented caveats, runnable end-to-end.

One opinion

"The chain of thought is the model's behavior, not its computation. Reasoning-model gains come from RL elicitation of latent capability, structured by the training distribution and amortized at inference. None of the three current theoretical frameworks alone is sufficient — and the most useful research moves the frontier where they conflict." — from Why do reasoning models work? A synthesis

Essays

Long-form syntheses

Where a survey paragraph isn't enough.

Auxiliary docs

Reading paths, glossary, BibTeX, model families

Cite the list

If this map is useful to your research

@misc{awesome_reasoning_models_theory_2026,
  title  = {Awesome Reasoning Models Theory: A theoretical and
            empirical map of the o-series / R1 / Claude-thinking paradigm},
  year   = {2026},
  url    = {https://github.com/bettyguo/awesome-reasoning-models-theory},
  note   = {Living document}
}

→ BibTeX for the anchor papers (per chapter)