Synthetic data powered the training of a new multilingual OCR model designed for speed. Hugging Face researchers used generated text images to overcome the scarcity of labeled real-world data. This approach reduces training costs and improves accuracy across diverse scripts. Practitioners can now deploy lightweight vision models for high-throughput text extraction.