SOTAVerified

Optical Character Recognition (OCR)

Optical Character Recognition, or Optical Character Reader (OCR), is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text. The source may be a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo, or license plates on cars), or subtitle text superimposed on an image (for example, from a television broadcast).

Papers

Showing 851-900 of 1209 papers

| Title | Status | Hype |
|---|---|---|
| Making the V in Text-VQA Matter | | 0 |
| MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining | | 0 |
| MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary | | 0 |
| MathWriting: A Dataset For Handwritten Mathematical Expression Recognition | | 0 |
| Matics Software Suite: New Tools for Evaluation and Data Exploration | | 0 |
| MatriVasha: A Multipurpose Comprehensive Database for Bangla Handwritten Compound Characters | | 0 |
| Measuring Contextual Fitness Using Error Contexts Extracted from the Wikipedia Revision History | | 0 |
| Measuring Innovation in Speech and Language Processing Publications. | | 0 |
| Measuring Lexical Quality of a Historical Finnish Newspaper Collection ― Analysis of Garbled OCR Data with Basic Language Technology Tools and Means | | 0 |
| Membership Model Inversion Attacks for Deep Networks | | 0 |
| Meme Sentiment Analysis Enhanced with Multimodal Spatial Encoding and Facial Embedding | | 0 |
| Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset | | 0 |
| MenuAI: Restaurant Food Recommendation System via a Transformer-based Deep Learning Model | | 0 |
| Mero Nagarikta: Advanced Nepali Citizenship Data Extractor with Deep Learning-Powered Text Detection and OCR | | 0 |
| MinD at SemEval-2021 Task 6: Propaganda Detection using Transfer Learning and Multimodal Fusion | | 0 |
| Mind the Gap: Analyzing Lacunae with Transformer-Based Transcription | | 0 |
| Supporting Land Reuse of Former Open Pit Mining Sites using Text Classification and Active Learning | | 0 |
| MIRAGE: Multimodal Identification and Recognition of Annotations in Indian General Prescriptions | | 0 |
| Mitigating Noisy Inputs for Question Answering | | 0 |
| Mixed Model OCR Training on Historical Latin Script for Out-of-the-Box Recognition and Finetuning | | 0 |
| MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning | | 0 |
| MMDocBench: Benchmarking Large Vision-Language Models for Fine-Grained Visual Document Understanding | | 0 |
| MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents | | 0 |
| MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency | | 0 |
| MME-Industry: A Cross-Industry Multimodal Evaluation Benchmark | | 0 |
| MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding | | 0 |
| MMR: Evaluating Reading Ability of Large Multimodal Models | | 0 |
| Morphological annotation of Old and Middle Hungarian corpora | | 0 |
| mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding | | 0 |
| mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding | | 0 |
| mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding | | 0 |
| MT^3: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning | | 0 |
| MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition | | 0 |
| MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation | | 0 |
| Multi-Granularity Prediction with Learnable Fusion for Scene Text Recognition | | 0 |
| Multi-Input Attention for Unsupervised OCR Correction | | 0 |
| Multikernel activation functions: formulation and a case study | | 0 |
| Multilingual Named Entity Recognition for Medieval Charters Using Stacked Embeddings and Bert-based Models. | | 0 |
| Multimodal Sentiment Analysis: Perceived vs Induced Sentiments | | 0 |
| Multimodal Short Video Rumor Detection System Based on Contrastive Learning | | 0 |
| Multimodal Transformer for Comics Text-Cloze | | 0 |
| Multi-modular domain-tailored OCR post-correction | | 0 |
| Multiple-Question Multiple-Answer Text-VQA | | 0 |
| Multistage Curvilinear Coordinate Transform Based Document Image Dewarping using a Novel Quality Estimator | | 0 |
| Multistep Automated Data Labelling Procedure (MADLaP) for Thyroid Nodules on Ultrasound: An Artificial Intelligence Approach for Automating Image Annotation | | 0 |
| Multi-Task Learning for Improved Discriminative Training in SMT | | 0 |
| Named Entity Recognition and Correction in OCRized Corpora (Détection et correction automatique d'entités nommées dans des corpus OCRisés) [in French] | | 0 |
| Named Entity Recognition in Historic Legal Text: A Transformer and State Machine Ensemble Method | | 0 |
| Named Entity Recognition in the Legal Domain using a Pointer Generator Network | | 0 |
| NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts | | 0 |

Benchmark Results

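| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DTrOCR | Accuracy (%) | 89.6 | | Unverified |
| 2 | DTrOCR 105M | Accuracy (%) | 89.6 | | Unverified |
| 3 | MaskOCR-L | Accuracy (%) | 82.6 | | Unverified |
| 4 | TransOCR | Accuracy (%) | 72.8 | | Unverified |
| 5 | SRN | Accuracy (%) | 65 | | Unverified |
| 6 | MORAN | Accuracy (%) | 64.3 | | Unverified |
| 7 | SEED | Accuracy (%) | 61.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4o | Average Accuracy | 76.22 | | Unverified |
| 2 | Gemini-1.5 Pro | Average Accuracy | 76.13 | | Unverified |
| 3 | Claude-3 Sonnet | Average Accuracy | 67.71 | | Unverified |
| 4 | RapidOCR | Average Accuracy | 56.98 | | Unverified |
| 5 | EasyOCR | Average Accuracy | 49.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STREET | Sequence error | 27.54 | | Unverified |
| 2 | SEE | Sequence error | 22 | | Unverified |
| 3 | AttentionOCR_Inception-resnet-v2_Location | Sequence error | 15.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | I2L-NOPOOL | BLEU | 89.09 | | Unverified |
| 2 | I2L-STRIPS | BLEU | 89 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Tesseract | Character Error Rate (CER) | 0.08 | | Unverified |
| 2 | EasyOCR | Character Error Rate (CER) | 0.07 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | I2L-STRIPS | BLEU | 88.86 | | Unverified |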
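
One of the metrics listed above, Character Error Rate (CER), is conventionally defined as the character-level edit distance between the OCR output and the reference transcription, divided by the length of the reference. The sketch below illustrates that common definition only. The function names `levenshtein` and `character_error_rate` are hypothetical, and the benchmarks listed here may apply their own normalization (case folding, whitespace handling) that this page does not specify.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and
    substitutions needed to turn `ref` into `hyp`."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER under the assumed definition: edit distance / reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One missing character in an 11-character reference.
    print(character_error_rate("recognition", "recogniton"))
```

Under this definition a lower value is better, and the value can exceed 1.0 when the hypothesis is much longer than the reference.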