SOTAVerified

Optical Character Recognition (OCR)

Optical Character Recognition or Optical Character Reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text. The source may be a scanned document, a photo of a document, a scene photo (for example, text on signs and billboards in a landscape photo, or license plates on cars), or subtitle text superimposed on an image (for example, from a television broadcast).

Papers

Showing 351–400 of 1209 papers

Title | Status | Hype
Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing | Code | 0
InstructOCR: Instruction Boosting Scene Text Spotting | Code | 0
DDI-100: Dataset for Text Detection and Recognition | Code | 0
DCQA: Document-Level Chart Question Answering towards Complex Reasoning and Common-Sense Understanding | Code | 0
Improving patch-based scene text script identification with ensembles of conjoined networks | Code | 0
Data-Driven Spelling Correction using Weighted Finite-State Methods | Code | 0
Improving OCR Accuracy on Early Printed Books by utilizing Cross Fold Training and Voting | Code | 0
Indiscapes: Instance Segmentation Networks for Layout Parsing of Historical Indic Manuscripts | Code | 0
Investigating OCR-Sensitive Neurons to Improve Entity Recognition in Historical Documents | Code | 0
Data Centric Domain Adaptation for Historical Text with OCR Errors | Code | 0
Implicit Language Model in LSTM for OCR | Code | 0
Crossing Language Borders: A Pipeline for Indonesian Manhwa Translation | Code | 0
iExam: A Novel Online Exam Monitoring and Analysis System Based on Face Detection and Recognition | Code | 0
Augmented Math: Authoring AR-Based Explorable Explanations by Augmenting Static Math Textbooks | Code | 0
Improving OCR Accuracy on Early Printed Books by combining Pretraining, Voting, and Active Learning | Code | 0
Attention-based Extraction of Structured Information from Street View Imagery | Code | 0
High-Throughput Phenotyping using Computer Vision and Machine Learning | Code | 0
Corpus for Coreference Resolution on Scientific Papers | Code | 0
Document Image Cleaning using Budget-Aware Black-Box Approximation | Code | 0
A Gaussian Process Upsampling Model for Improvements in Optical Character Recognition | Code | 0
Enhancing Cross-task Transferability of Adversarial Examples with Dispersion Reduction | Code | 0
Do Current Video LLMs Have Strong OCR Abilities? A Preliminary Study | Code | 0
Historical Ink: 19th Century Latin American Spanish Newspaper Corpus with LLM OCR Correction | Code | 0
An agentic system with reinforcement-learned subsystem improvements for parsing form-like documents | Code | 0
CORD: A Consolidated Receipt Dataset for Post-OCR Parsing | Code | 0
Convolution-based Probability Gradient Loss for Semantic Segmentation | Code | 0
Orchestrator-Agent Trust: A Modular Agentic AI Visual Classification System with Trust-Aware Orchestration and RAG-Based Reasoning | Code | 0
Order-preserving Consistency Regularization for Domain Adaptation and Generalization | Code | 0
Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Network | Code | 0
HENet: Forcing a Network to Think More for Font Recognition | Code | 0
Improving OCR Accuracy on Early Printed Books using Deep Convolutional Networks | Code | 0
DriveThru: a Document Extraction Platform and Benchmark Datasets for Indonesian Local Language Archives | Code | 0
LOANet: A Lightweight Network Using Object Attention for Extracting Buildings and Roads from UAV Aerial Remote Sensing Images | Code | 0
A Tool for Facilitating OCR Postediting in Historical Documents | Code | 0
A template-independent approach for information extraction in real estate documents | Code | 0
Analyzing Green View Index and Green View Index best path using Google Street View and deep learning | Code | 0
GeoContrastNet: Contrastive Key-Value Edge Learning for Language-Agnostic Document Understanding | Code | 0
From Videos to URLs: A Multi-Browser Guide To Extract User's Behavior with Optical Character Recognition | Code | 0
E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text | Code | 0
Brno Mobile OCR Dataset | Code | 0
From the Paft to the Fiiture: a Fully Automatic NMT and Word Embeddings Method for OCR Post-Correction | Code | 0
Early evidence of how LLMs outperform traditional systems on OCR/HTR tasks for historical records | Code | 0
Gated Recurrent Convolution Neural Network for OCR | Code | 0
A Multi-Object Rectified Attention Network for Scene Text Recognition | Code | 0
Handwriting Classification for the Analysis of Art-Historical Documents | Code | 0
EATEN: Entity-aware Attention for Single Shot Visual Text Extraction | Code | 0
Quantifying Character Similarity with Vision Transformers | Code | 0
FastTextSpotter: A High-Efficiency Transformer for Multilingual Scene Text Spotting | Code | 0
FashionLOGO: Prompting Multimodal Large Language Models for Fashion Logo Embeddings | Code | 0
A Survey of Deep Learning Approaches for OCR and Document Understanding | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | DTrOCR | Accuracy (%) | 89.6 | | Unverified
2 | DTrOCR 105M | Accuracy (%) | 89.6 | | Unverified
3 | MaskOCR-L | Accuracy (%) | 82.6 | | Unverified
4 | TransOCR | Accuracy (%) | 72.8 | | Unverified
5 | SRN | Accuracy (%) | 65 | | Unverified
6 | MORAN | Accuracy (%) | 64.3 | | Unverified
7 | SEED | Accuracy (%) | 61.2 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4o | Average Accuracy | 76.22 | | Unverified
2 | Gemini-1.5 Pro | Average Accuracy | 76.13 | | Unverified
3 | Claude-3 Sonnet | Average Accuracy | 67.71 | | Unverified
4 | RapidOCR | Average Accuracy | 56.98 | | Unverified
5 | EasyOCR | Average Accuracy | 49.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | STREET | Sequence error | 27.54 | | Unverified
2 | SEE | Sequence error | 22 | | Unverified
3 | AttentionOCR_Inception-resnet-v2_Location | Sequence error | 15.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | I2L-NOPOOL | BLEU | 89.09 | | Unverified
2 | I2L-STRIPS | BLEU | 89 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Tesseract | Character Error Rate (CER) | 0.08 | | Unverified
2 | EasyOCR | Character Error Rate (CER) | 0.07 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | I2L-STRIPS | BLEU | 88.86 | | Unverified
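
One of the metrics reported above is Character Error Rate (CER). As a rough illustration only, CER is commonly computed as the character-level edit distance between the OCR output and the ground-truth text, divided by the length of the ground truth. The sketch below assumes that definition; the benchmarks listed here may normalize or preprocess text differently, and the function names `levenshtein` and `cer` are hypothetical helpers, not part of any listed model or library.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `reference` into `hypothesis`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                       # delete ref_char
                curr[j - 1] + 1,                   # insert hyp_char
                prev[j - 1] + substitution_cost,   # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance divided by reference length
    (assumed definition; lower is better)."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Illustrative strings only, not taken from any benchmark dataset.
    ground_truth = "From the Past to the Future"
    ocr_output = "From the Paft to the Fiiture"
    print(f"CER = {cer(ground_truth, ocr_output):.3f}")
```

Under this definition a lower CER is better, and the value can exceed 1.0 when the OCR output contains many spurious characters.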