
This analysis examines Chandra OCR, a state-of-the-art, open-source document intelligence model developed by Datalab.
Built on a Transformer-based multimodal architecture and optimized for performance using the vLLM inference engine, the model demonstrates benchmark-leading capabilities in processing challenging elements like tables, handwriting, and mathematical formulas.
The analysis concludes by discussing the data-sovereignty advantage of self-hosting the model, while noting that its OpenRAIL license and high computational requirements constrain enterprise adoption.