SOTAVerified

Optical Character Recognition (OCR)

Optical Character Recognition, or Optical Character Reader (OCR), is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text. The source can be a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo, or license plates on cars), or subtitle text superimposed on an image (for example, from a television broadcast).

Papers

Showing 601-650 of 1209 papers

Title | Status | Hype
Derivate-based Component-Trees for Multi-Channel Image Segmentation | | 0
Design and Development of a Framework For Stroke-Based Handwritten Gujarati Font Generation | | 0
Design and Implementation of an OCR-Powered Pipeline for Table Extraction from Invoices | | 0
Detecting de minimis Code-Switching in Historical German Books | | 0
Détection d'erreurs dans des transcriptions OCR de documents historiques par réseaux de neurones récurrents multi-niveau (Combining character level and word level RNNs for post-OCR error detection) | | 0
Detection Masking for Improved OCR on Noisy Documents | | 0
Reciprocal Feature Learning via Explicit and Implicit Tasks in Scene Text Recognition | | 0
Recognition of Images of Korean Characters Using Embedded Networks | | 0
Recognition of Text Image Using Multilayer Perceptron | | 0
Recommending Scientific Videos based on Metadata Enrichment using Linked Open Data | | 0
Reconnaissance d’entités nommées sur des sorties OCR bruitées : des pistes pour la désambiguïsation morphologique automatique (Resolution of entity linking issues on noisy OCR output : automatic disambiguation tracks) | | 0
Recursive Recurrent Nets with Attention Modeling for OCR in the Wild | | 0
Reference-Based Post-OCR Processing with LLM for Diacritic Languages | | 0
Refining Corpora from a Model Calibration Perspective for Chinese Spelling Correction | | 0
Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation | | 0
Regularization and Kernelization of the Maximin Correlation Approach | | 0
ReLayout: Towards Real-World Document Understanding via Layout-enhanced Pre-training | | 0
Representing Online Handwriting for Recognition in Large Vision-Language Models | | 0
Reranking with Linguistic and Semantic Features for Arabic Optical Character Recognition | | 0
Resilience of Large Language Models for Noisy Instructions | | 0
Resolving Referring Expressions in Images With Labeled Elements | | 0
Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition | | 0
Resource Constrained Structured Prediction | | 0
Resume Information Extraction via Post-OCR Text Processing | | 0
Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge | | 0
Revisiting Noise in Natural Language Processing for Computational Social Science | | 0
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking | | 0
Robustness Evaluation of Transformer-based Form Field Extractors via Form Attacks | | 0
Robust Text CAPTCHAs Using Adversarial Examples | | 0
Rosetta: Large scale system for text detection and recognition in images | | 0
SAHSOH@QALB-2015 Shared Task: A Rule-Based Correction Method of Common Arabic Native and Non-Native Speakers' Errors | | 0
SAML-QC: a Stochastic Assessment and Machine Learning based QC technique for Industrial Printing | | 0
SARD: A Large-Scale Synthetic Arabic OCR Dataset for Book-Style Text Recognition | | 0
Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents | | 0
Scaling Automatic Extraction of Pseudocode | | 0
Scatteract: Automated extraction of data from scatter plots | | 0
SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering | | 0
Scene Text recognition with Full Normalization | | 0
SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild | | 0
SciCapenter: Supporting Caption Composition for Scientific Figures with Machine-Generated Captions and Ratings | | 0
Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models | | 0
Seeing the Signs: A Survey of Edge-Deployable OCR Models for Billboard Visibility Analysis | | 0
See then Tell: Enhancing Key Information Extraction with Vision Grounding | | 0
SEE: Towards Semi-Supervised End-to-End Scene Text Recognition | | 0
Segmentation-free Connectionist Temporal Classification loss based OCR Model for Text Captcha Classification | | 0
Self-paced learning to improve text row detection in historical documents with missing labels | | 0
Self-supervised Data Bootstrapping for Deep Optical Character Recognition of Identity Documents | | 0
Semantic rule Web-based Diagnosis and Treatment of Vector-Borne Diseases using SWRL rules | | 0
Semantic Segmentation for Real-World and Synthetic Vehicle's Forward-Facing Camera Images | | 0
Semi-automated annotation of page-based documents within the Genre and Multimodality framework | | 0
Page 13 of 25

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | DTrOCR 105M | Accuracy (%) | 89.6 | | Unverified
2 | DTrOCR | Accuracy (%) | 89.6 | | Unverified
3 | MaskOCR-L | Accuracy (%) | 82.6 | | Unverified
4 | TransOCR | Accuracy (%) | 72.8 | | Unverified
5 | SRN | Accuracy (%) | 65 | | Unverified
6 | MORAN | Accuracy (%) | 64.3 | | Unverified
7 | SEED | Accuracy (%) | 61.2 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4o | Average Accuracy | 76.22 | | Unverified
2 | Gemini-1.5 Pro | Average Accuracy | 76.13 | | Unverified
3 | Claude-3 Sonnet | Average Accuracy | 67.71 | | Unverified
4 | RapidOCR | Average Accuracy | 56.98 | | Unverified
5 | EasyOCR | Average Accuracy | 49.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | STREET | Sequence error | 27.54 | | Unverified
2 | SEE | Sequence error | 22 | | Unverified
3 | AttentionOCR_Inception-resnet-v2_Location | Sequence error | 15.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | I2L-NOPOOL | BLEU | 89.09 | | Unverified
2 | I2L-STRIPS | BLEU | 89 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Tesseract | Character Error Rate (CER) | 0.08 | | Unverified
2 | EasyOCR | Character Error Rate (CER) | 0.07 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | I2L-STRIPS | BLEU | 88.86 | | Unverified
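
One of the benchmarks above reports Character Error Rate (CER). As a rough illustration of how that metric is commonly defined, the sketch below computes CER as the Levenshtein edit distance between a reference transcription and an OCR output, divided by the reference length. This is an assumption about the usual definition; the listing does not say how the claimed values were computed, and the function names `levenshtein` and `character_error_rate` are hypothetical, introduced here only for illustration.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `ref` into `hyp`."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Illustrative strings only, not taken from any benchmark above.
    print(character_error_rate("kitten", "sitting"))
```

Under this definition, lower is better, and the value can exceed 1.0 when the OCR output contains many spurious characters.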