software

Open-source research code, agent infrastructure, and curated maps.

Open source is where the second half of the work lives: the system that meets each proven bound, the infrastructure the group runs on, and the maps I wish had existed when I started. Below is a curated selection across seven themes; the full set is on GitHub.

bettyguo on GitHub

38 curated repositories
8 peer-reviewed artifacts
7 research themes
Spotlight SIGIR '26

realm-retrieve

ReaLM-Retrieve decides when to retrieve during reasoning with an information-theoretic stopping rule rather than heuristics. It is the adaptive-RAG policy for large reasoning models.

117 stars on GitHub

Paper code 7

One repository per publication — the theory and the system that meets it, in the same artifact.

Post-Transformer architectures 6

An exploratory program testing five candidate sequence architectures beyond attention. They are designed to compose.

Agent & MCP infrastructure 8

Local-first when possible, verifiable when not.

Trustworthy & verification 4

Guarantees that survive an audit, not just a benchmark.

Evaluation & auditing 4

If a number can be gamed, assume it has been. Probes that check the benchmark before you trust the score.

Research maps & atlases 4

What I had to learn the hard way, verified and written down for the next person.

Interpretability & developer tools 4

Make model internals visible; keep the agent stack honest.