
Reasoning model families (catalog)

A short reference catalog of the major reasoning-model families circa 2025–2026. It is not exhaustive; the goal is to disambiguate names that appear across chapters and essays.

For each family: release date(s), open-vs-closed status, key public claims, and where in this list it’s discussed.


Closed-weights families

OpenAI o-series

Status: 🔴 closed weights, vendor-reported numbers. Cited carefully throughout the list; never as “the canonical scaling curve.”

Anthropic Claude with extended thinking

Status: 🟡 partial — methods writeups are substantive, weights closed. Faithfulness work (Anthropic 2025) gives some research-grade access.

Google Gemini reasoning variants

Status: 🔴 closed. Connected to the DeepMind AlphaProof/AlphaGeometry line (Chapter 4) on the formal-proof side.


Open-weights families

DeepSeek

Status: 🟢 open weights, open recipe. The empirical anchor for most of this list.

Qwen reasoning variants

Status: 🟢 open. Frequently used as a base in open RLVR work; appears in notebooks 01–04.

Alibaba / Tongyi reasoning models

Several reasoning-mode variants of the Tongyi family in 2025–2026; treated as the Qwen sibling here.

Open R1 reproductions

Status: 🟢 open. Listed in Chapter 5.

Tülu / AI2

Status: 🟢 open.

Mistral reasoning variants

Mistral released several reasoning-tuned models in 2025; the lineage is less consolidated than DeepSeek/Qwen and is treated paper-by-paper.


Formal-proof systems (a separate clade)

DeepMind AlphaProof + AlphaGeometry 2

Cited in Chapter 4 as the regime where explicit search and RL are complementary, not competing.

Other formal-proof RL systems

Several 2025 follow-ups on Lean / Coq with RL and tree search; tracked under WANTED.md.


How to use this catalog


What this catalog deliberately omits

Filed 2026-05-14. Update as the field’s families consolidate or split. PR-friendly.