Working notes — things I'm reading, thinking about, and trying to figure out. Less polished than the long-form posts, sometimes revised in place.
TODO
TODO
Evolving rubrics for training a small, powerful deep research model.
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
Novel RL techniques for training a surprisingly powerful small prover.
TODO
TODO
TODO
TODO
TODO
TODO
TODO