SOTAVerified

Optical Character Recognition

Papers

Showing 1-25 of 526 papers

Title | Status | Hype
Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis | - | 0
A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends | - | 0
Logios : An open source Greek Polytonic Optical Character Recognition system | - | 0
Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages | - | 0
An accurate and revised version of optical character recognition-based speech synthesis using LabVIEW | - | 0
Intelligent Automation for FDI Facilitation: Optimizing Tariff Exemption Processes with OCR And Large Language Models | - | 0
Task-driven real-world super-resolution of document scans | - | 0
Reading in the Dark with Foveated Event Vision | - | 0
MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories | Code | 2
SARD: A Large-Scale Synthetic Arabic OCR Dataset for Book-Style Text Recognition | - | 0
Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition | Code | 1
TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance | - | 0
MT^3: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning | - | 0
Words as Geometric Features: Estimating Homography using Optical Character Recognition as Compressed Image Representation | - | 0
How Do Large Vision-Language Models See Text in Image? Unveiling the Distinctive Role of OCR Heads | - | 0
Every Pixel Tells a Story: End-to-End Urdu Newspaper OCR | - | 0
Reasoning-OCR: Can Large Multimodal Models Solve Complex Logical Reasoning Problems from OCR Cues? | Code | 1
LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images? | Code | 1
Low-Resource Language Processing: An OCR-Driven Summarization and Translation Pipeline | Code | 0
PsOCR: Benchmarking Large Multimodal Models for Optical Character Recognition in Low-resource Pashto Language | Code | 0
A document processing pipeline for the construction of a dataset for topic modeling based on the judgments of the Italian Supreme Court | - | 0
Reproducibility, Replicability, and Insights into Visual Document Retrieval with Late Interaction | Code | 0
Development of a WAZOBIA-Named Entity Recognition System | - | 0
Arrow-Guided VLM: Enhancing Flowchart Understanding via Arrow Direction Encoding | Code | 0
Toward Advancing License Plate Super-Resolution in Real-World Scenarios: A Dataset and Benchmark | Code | 0

No leaderboard results yet.