olmOCR-Bench
last updated 2026-05-27
Introduced in the olmOCR paper.
olmOCR-Bench consists of about 1,400 PDFs and 7,000 unit tests. The evaluation approach is interesting and simple: each page gets several binary (pass/fail) judgements, and the score is the accuracy over those judgements, rather than the usual hodgepodge of edit-distance-based metrics.
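A minimal sketch of the "binary judgements per page, then accuracy" idea, in Python. This is not the benchmark's actual code: `UnitTest`, `text_present`, `text_absent`, `appears_before`, and `accuracy` are hypothetical names, the sample pages are made up, and the real benchmark may aggregate scores differently (for example per category).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class UnitTest:
    """One binary judgement about the OCR output of a single page (hypothetical)."""
    page_id: str
    description: str
    check: Callable[[str], bool]


def text_present(snippet: str) -> Callable[[str], bool]:
    # Passes if the snippet occurs somewhere in the page text.
    return lambda text: snippet in text


def text_absent(snippet: str) -> Callable[[str], bool]:
    # Passes if the snippet (e.g. a running footer) was left out.
    return lambda text: snippet not in text


def appears_before(first: str, second: str) -> Callable[[str], bool]:
    # Passes if both snippets occur and `first` precedes `second`.
    def check(text: str) -> bool:
        i, j = text.find(first), text.find(second)
        return i != -1 and j != -1 and i < j
    return check


def accuracy(tests: list[UnitTest], ocr_output: dict[str, str]) -> float:
    """Fraction of unit tests that pass; a missing page fails its presence tests."""
    if not tests:
        raise ValueError("no unit tests to score")
    passed = sum(1 for t in tests if t.check(ocr_output.get(t.page_id, "")))
    return passed / len(tests)


# Made-up OCR output for two pages; the second contains an OCR error ("Tab1e").
ocr_output = {
    "doc1-p1": "Introduction\nWe study OCR.\nMethods\nWe run tests.",
    "doc2-p1": "Results are shown in Tab1e 2.",
}

tests = [
    UnitTest("doc1-p1", "body sentence kept", text_present("We study OCR")),
    UnitTest("doc1-p1", "footer dropped", text_absent("Page 1 of 10")),
    UnitTest("doc1-p1", "reading order", appears_before("Introduction", "Methods")),
    UnitTest("doc2-p1", "table reference kept", text_present("Table 2")),
]

score = accuracy(tests, ocr_output)
print(f"{score:.2f}")  # 3 of 4 tests pass -> 0.75
```

Each check is a plain yes/no, so a failure points at a specific problem on a specific page, which is harder to read off a single edit-distance number.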
Used to evaluate: