Paper Notes: LightOnOCR1 and 2
last updated 2026-05-10
https://arxiv.org/pdf/2601.14251
Trained with RLVR (reinforcement learning with verifiable rewards), like OlmOCR; open dataset (16M pages from the PDFA corpus)
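
To make the RLVR note concrete, here is a toy reward function. It is only a sketch of the general idea: the reward comes from a programmatic check against known page content rather than from a learned judge. The function name `verifiable_reward`, the `required_snippets` argument, and the snippet-matching rule are all hypothetical and are not taken from the LightOnOCR or OlmOCR papers.

```python
def verifiable_reward(ocr_output: str, required_snippets: list[str]) -> float:
    """Toy verifiable reward (hypothetical, not the papers' reward).

    Returns the fraction of required snippets that appear in the OCR
    output, comparing after collapsing runs of whitespace.
    """
    if not required_snippets:
        # Nothing to verify, so give no reward signal.
        return 0.0

    # Collapse whitespace so line breaks and spacing do not affect matching.
    normalized_output = " ".join(ocr_output.split())

    hits = 0
    for snippet in required_snippets:
        normalized_snippet = " ".join(snippet.split())
        if normalized_snippet in normalized_output:
            hits += 1

    return hits / len(required_snippets)


if __name__ == "__main__":
    page_text = "Total:   42 USD\nInvoice 7"
    checks = ["Total: 42 USD", "Invoice 7", "Due date"]
    print(verifiable_reward(page_text, checks))
```

Because the check is deterministic, the same model output always gets the same score, which is the property that makes a reward "verifiable" in this sense.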