Joe Barrow

Paper Notes: MEDUSA 2026-07-15

Tree-based attention for shared-prefix draft decoding.

Paper Notes: SpecInfer 2026-07-15

Tree-based attention for shared-prefix draft decoding.

Paper Notes: DeepSeek-v3 Technical Report 2026-07-13

The model that put DeepSeek on the map, plus their MTP usage.

Paper Notes: EAGLE-2 2026-07-13

A lightweight speculative decoding method from NVIDIA that decodes multiple drafts in parallel.

Paper Notes: EAGLE-3.1 2026-07-13

A bag of tricks to improve EAGLE-3.

Paper Notes: EAGLE-3 2026-07-13

A lightweight speculative decoding method from NVIDIA that can learn from more data than its predecessor.

Paper Notes: EAGLE 2026-07-13

First generation of a lightweight speculative decoding method from NVIDIA that decodes multiple drafts in parallel.

Paper Notes: Multi-token Prediction 2026-07-13

Train a few more LM heads for speculative decoding.

Paper Notes: dFlash 2026-07-13

Block-causal diffusion for speculative decoding.

Paper Notes: dInfer 2026-07-13

Tree-structured inference for speculative decoding.

Paper Notes: dSpark 2026-07-13

In which DeepSeek proposes a "lightly-autoregressive" speculative decoder with confidence predictions.

NVIDIA Triton for Serving Small Models 2026-07-08

Notes on using Triton to serve small models.

A Visual Guide to the Roofline Model 2026-06-26

My attempt at a simple/visual explanation of arithmetic intensity and the roofline model.

Paper Notes: FastContext 2026-06-16

A small sub-agent for finding context, released by Microsoft.

NVIDIA MIG 2026-06-11

How do you split a single (datacenter) GPU into multiple?

Paper Notes: LocateAnything 2026-06-08

NVIDIA object detection with VLMs paper, specifically geared around fast inference with blockwise predictions.

Training a Custom EAGLE-3 Head 2026-06-08

Working through the BaseTen EAGLE-3 tutorial.

Paper Notes: MinerU-Popo 2026-05-30

Paper Notes: Slow Search 2026-05-29

Looking at a 2010 paper with fresh eyes: what does search look like on Mars?

Paper Notes: Nemotron Parse 1.1 2026-05-27

What happens when you pair a huge vision encoder (600M params) with a tiny text decoder (250M params)? Let's find out!

OmniDocBench 2026-05-27

1000 difficult-to-OCR pages, used as a canonical torture test.

Paper Notes: QED-Nano 2026-05-27

RL techniques for training a surprisingly powerful small prover.

Notes: Surya OCR 2 2026-05-27

A 650M param OCR model that's ~on par with LightOnOCR-2, and outputs boxes as well.

Paper Notes: OlmOCR 2 2026-05-27

An RLVR approach for training OCR.

olmOCR Bench 2026-05-27

A 1400-PDF benchmark that uses unit test rewards to compute accuracy.

Paper Notes: HunyuanOCR 2026-05-23

A performant 1B-parameter OCR model, built on Hunyuan Large 0.5B.

Paper Notes: Kosmos-2.5 2026-05-23

A 1.3B VLM trained on over 350MM pages to output text block coordinates and their text.

Paper Notes: Hierarchical Speculative Decoding 2026-05-22

A training-free speedup for document parsing, by getting a good speculator.

Paper Notes: dots.mOCR 2026-05-22

A 3B OCR model that also derenders charts, tables, and other graphical elements.

Paper Notes: DODO Diffusion OCR 2026-05-18

A diffusion OCR model.

Paper Notes: MinerU-Diffusion 2026-05-18

Speeding up vLLM Start Times 2026-05-18

Adding a cache directory for vLLM docker can reduce start times to ~11s.

Paper Notes: LightOnOCR 2026-05-12

A 1B VLM, stitching together Qwen3-0.6B and SigLip-400M.

Paper Notes: OBLIQ-Bench 2026-05-12

A benchmark for the current frontier of retrievers: possible to verify with reasoning models, difficult to retrieve.

Paper Notes: DONUT 2026-05-11

Can you perform tasks over documents purely using the document image?

Paper Notes: DeepSeek-OCR 2026-05-11

In which DeepSeek argues that document images can be more dense, lossless input representations.

Paper Notes: Dolphin OCR 2026-05-11

A tiny, two-stage, DONUT-based OCR model.

Paper Notes: Chandra 2026-05-10

TODO

Paper Notes: DR Tulu 2026-05-10

Evolving rubrics for training a small, powerful deep research model.

Paper Notes: Fox 2026-05-10

TODO

Paper Notes: GOT 2.0 2026-05-10

TODO

Paper Notes: Granite Docling 2026-05-10

TODO

Paper Notes: LightOnOCR 2 2026-05-10

An update to LightOnOCR, plus a bbox model for figures.

Paper Notes: MinerU2.5 2026-05-10

TODO

Paper Notes: MonkeyOCR-v1.5 2026-05-10

TODO

Paper Notes: MonkeyOCR 2026-05-10

TODO

Paper Notes: NanoNets OCR 2026-05-10

TODO

Paper Notes: NanoNets OCR2 2026-05-10

TODO

Paper Notes: Nougat 2026-05-10

TODO

Paper Notes: OCRFlux 2026-05-10

TODO

Paper Notes: PaddleOCR-VL 2026-05-10

TODO

Paper Notes: Pix2Struct 2026-05-10

TODO

Paper Notes: Qwen2-VL 2026-05-10

TODO

Paper Notes: Qwen2.5-VL 2026-05-10

TODO

Paper Notes: Qwen3-VL 2026-05-10

TODO

Paper Notes: RolmOCR 2026-05-10

A finetune of Qwen2.5-7B for OCR from Reducto.

Paper Notes: TrOCR 2026-05-10

TODO

Paper Notes: Vary 2026-05-10

TODO

Paper Notes: dots.OCR 2026-05-10

TODO

Paper Notes: OlmOCR 2026-05-10

A truly open OCR model (dataset, model, code) based on Qwen2-VL-7B.

Field Notes