vlm ocr survey / introduction
Strong open VLMs have enabled an explosion of open OCR model releases, and the pace shows little sign of slowing. In this survey, I detail the models, their evaluation, research trends, and open questions.
Probing the supported output types of Gemini.
Navigating Gemini's API for object detection with vision and Structured Outputs.
Thoughts on averaged benchmarks and hidden correlations.
tinyhnsw / introduction
The first post in the TinyHNSW series, introducing the tutorial and the library.