| Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis | Jul 15, 2025 | Marketing, Optical Character Recognition | Unverified | 0 |
| A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends | Jul 14, 2025 | document understanding, Optical Character Recognition | Unverified | 0 |
| Logios : An open source Greek Polytonic Optical Character Recognition system | Jun 26, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages | Jun 22, 2025 | image-classification, Image Classification | Unverified | 0 |
| An accurate and revised version of optical character recognition-based speech synthesis using LabVIEW | Jun 18, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| Intelligent Automation for FDI Facilitation: Optimizing Tariff Exemption Processes with OCR And Large Language Models | Jun 12, 2025 | Large Language Model, Optical Character Recognition | Unverified | 0 |
| Task-driven real-world super-resolution of document scans | Jun 8, 2025 | Image Super-Resolution, Multi-Task Learning | Unverified | 0 |
| Reading in the Dark with Foveated Event Vision | Jun 7, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories | Jun 5, 2025 | Benchmarking, Optical Character Recognition | Code Available | 2 |
| SARD: A Large-Scale Synthetic Arabic OCR Dataset for Book-Style Text Recognition | May 30, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition | May 29, 2025 | Handwritten Mathematical Expression Recognition, Language Modeling | Code Available | 1 |
| TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance | May 29, 2025 | Image Super-Resolution, Optical Character Recognition | Unverified | 0 |
| MT^3: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning | May 26, 2025 | document understanding, Machine Translation | Unverified | 0 |
| Words as Geometric Features: Estimating Homography using Optical Character Recognition as Compressed Image Representation | May 25, 2025 | Anomaly Detection, Homography Estimation | Unverified | 0 |
| How Do Large Vision-Language Models See Text in Image? Unveiling the Distinctive Role of OCR Heads | May 21, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| Every Pixel Tells a Story: End-to-End Urdu Newspaper OCR | May 20, 2025 | Articles, Image Super-Resolution | Unverified | 0 |
| Reasoning-OCR: Can Large Multimodal Models Solve Complex Logical Reasoning Problems from OCR Cues? | May 19, 2025 | Logical Reasoning, Optical Character Recognition | Code Available | 1 |
| LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images? | May 18, 2025 | Logical Reasoning, Multimodal Reasoning | Code Available | 1 |
| Low-Resource Language Processing: An OCR-Driven Summarization and Translation Pipeline | May 16, 2025 | Abstractive Text Summarization, Language Modeling | Code Available | 0 |
| PsOCR: Benchmarking Large Multimodal Models for Optical Character Recognition in Low-resource Pashto Language | May 15, 2025 | Benchmarking, Optical Character Recognition | Code Available | 0 |
| A document processing pipeline for the construction of a dataset for topic modeling based on the judgments of the Italian Supreme Court | May 13, 2025 | Diversity, Document Layout Analysis | Unverified | 0 |
| Reproducibility, Replicability, and Insights into Visual Document Retrieval with Late Interaction | May 12, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 0 |
| Development of a WAZOBIA-Named Entity Recognition System | May 10, 2025 | Machine Translation, named-entity-recognition | Unverified | 0 |
| Arrow-Guided VLM: Enhancing Flowchart Understanding via Arrow Direction Encoding | May 9, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 0 |
| Toward Advancing License Plate Super-Resolution in Real-World Scenarios: A Dataset and Benchmark | May 9, 2025 | License Plate Recognition, Optical Character Recognition | Code Available | 0 |