Joe Barrow field_notes

Field Notes

olmOCR-Bench

last updated 2026-05-27

Introduced in the olmOCR paper.

olmOCR-Bench is an evaluation set of roughly 1,400 PDFs with 7,000 unit tests. The approach is simple and super interesting: make several binary pass/fail judgements per page and compute accuracy over them, rather than relying on the usual hodgepodge of edit-distance-based metrics.
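A minimal sketch of the scoring idea, to make "several binary judgements per page, compute accuracy" concrete. The `(page_id, passed)` record shape, the function names, and the sample data are all hypothetical, and how olmOCR-Bench itself aggregates results (for example, per test category) isn't covered in these notes, so treat this as an illustration rather than the benchmark's actual scoring code.

```python
from collections import defaultdict


def pass_rate(results):
    """Fraction of binary checks that passed."""
    results = list(results)
    if not results:
        raise ValueError("no test results to score")
    return sum(1 for r in results if r) / len(results)


def per_page_pass_rate(judgements):
    """Group (page_id, passed) pairs by page and score each page separately."""
    by_page = defaultdict(list)
    for page_id, passed in judgements:
        by_page[page_id].append(bool(passed))
    return {page: pass_rate(checks) for page, checks in by_page.items()}


# Hypothetical judgements: each row is one unit test's outcome on one page.
judgements = [
    ("doc1-p1", True),
    ("doc1-p1", False),
    ("doc1-p1", True),
    ("doc2-p1", True),
    ("doc2-p1", True),
]

overall = pass_rate(passed for _, passed in judgements)
pages = per_page_pass_rate(judgements)
print(overall)  # 0.8 (4 of 5 checks passed)
```

Every check carries the same weight here, so the overall score is a plain accuracy over all unit tests; the per-page breakdown is only there to show where failures concentrate.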

Used to evaluate: