SOTAVerified

Optical Character Recognition (OCR)

Optical Character Recognition, also called Optical Character Reader (OCR), is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text. The source can be a scanned document, a photo of a document, a scene photo (for example, text on signs and billboards in a landscape photo, or license plates on cars), or subtitle text superimposed on an image (for example, from a television broadcast).

Papers

Showing 751–800 of 1209 papers

| Title | Status | Hype |
|---|---|---|
| Robust Text CAPTCHAs Using Adversarial Examples | | 0 |
| On-Device Document Classification using multimodal features | | 0 |
| End-to-End Piece-Wise Unwarping of Document Images | | 0 |
| Iranis: A Large-scale Dataset of Farsi License Plate Characters | Code | 1 |
| NOSE Augment: Fast and Effective Data Augmentation Without Searching | | 0 |
| BROS: A Pre-trained Language Model for Understanding Texts in Document | | 0 |
| ConvMath: A Convolutional Sequence Network for Mathematical Expression Recognition | | 0 |
| Named Entity Recognition in the Legal Domain using a Pointer Generator Network | | 0 |
| Indonesian ID Card Extractor Using Optical Character Recognition and Natural Language Post-Processing | | 0 |
| FAWA: Fast Adversarial Watermark Attack on Optical Character Recognition (OCR) Systems | Code | 1 |
| Discovering Airline-Specific Business Intelligence from Online Passenger Reviews: An Unsupervised Text Analytics Approach | | 0 |
| Vartani Spellcheck -- Automatic Context-Sensitive Spelling Correction of OCR-generated Hindi Text Using BERT and Levenshtein Distance | | 0 |
| Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps | | 0 |
| TAP: Text-Aware Pre-training for Text-VQA and Text-Caption | Code | 1 |
| Confidence-aware Non-repetitive Multimodal Transformers for TextCaps | Code | 1 |
| A Two-Step Approach for Automatic OCR Post-Correction | Code | 1 |
| Detecting de minimis Code-Switching in Historical German Books | | 0 |
| Ad Lingua: Text Classification Improves Symbolism Prediction in Image Advertisements | | 0 |
| Building a Part-of-Speech Tagged Corpus for Drenjongke (Bhutia) | Code | 0 |
| BennettNLP at SemEval-2020 Task 8: Multimodal sentiment classification Using Hybrid Hierarchical Classifier | | 0 |
| CSECU_KDE_MA at SemEval-2020 Task 8: A Neural Attention Model for Memotion Analysis | | 0 |
| SIS@IIITH at SemEval-2020 Task 8: An Overview of Simple Text Classification Methods for Meme Analysis | | 0 |
| Intrinsic Decomposition of Document Images In-the-Wild | Code | 1 |
| A Survey of Deep Learning Approaches for OCR and Document Understanding | Code | 0 |
| A Panoramic Survey of Natural Language Processing in the Arab World | | 0 |
| SuperOCR: A Conversion from Optical Character Recognition to Image Captioning | | 0 |
| On-Device Text Image Super Resolution | | 0 |
| Clustering-based Automatic Construction of Legal Entity Knowledge Base from Contracts | | 0 |
| On-Device Language Identification of Text in Images using Diacritic Characters | | 0 |
| OCR Post Correction for Endangered Language Texts | Code | 1 |
| Automated data extraction of bar chart raster images | | 0 |
| An Unsupervised method for OCR Post-Correction and Spelling Normalisation for Finnish | Code | 1 |
| Handwriting Classification for the Analysis of Art-Historical Documents | Code | 0 |
| Automated Transcription of Non-Latin Script Periodicals: A Case Study in the Ottoman Turkish Print Archive | | 0 |
| OCR, Classification & Machine Translation (OCCAM) | | 0 |
| Chunk-based Chinese Spelling Check with Global Optimization | | 0 |
| Alleviating Digitization Errors in Named Entity Recognition for Historical Documents | Code | 0 |
| RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering | Code | 1 |
| Persian Handwritten Digit, Character and Word Recognition Using Deep Learning | | 0 |
| TLGAN: document Text Localization using Generative Adversarial Nets | Code | 1 |
| Boosting High-Level Vision with Joint Compression Artifacts Reduction and Super-Resolution | | 0 |
| DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement | Code | 1 |
| A Conglomerate of Multiple OCR Table Detection and Extraction | | 0 |
| DocStruct: A Multimodal Method to Extract Hierarchy Structure in Document for General Form Understanding | | 0 |
| Tokenization Repair in the Presence of Spelling Errors | Code | 1 |
| Table Structure Recognition using Top-Down and Bottom-Up Cues | Code | 1 |
| Finding the Evidence: Localization-aware Answer Prediction for Text Visual Question Answering | | 0 |
| A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes | Code | 1 |
| Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition | | 0 |
| Towards Image-based Automatic Meter Reading in Unconstrained Scenarios: A Robust and Efficient Approach | | 0 |

Benchmark Results

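| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DTrOCR | Accuracy (%) | 89.6 | | Unverified |
| 2 | DTrOCR 105M | Accuracy (%) | 89.6 | | Unverified |
| 3 | MaskOCR-L | Accuracy (%) | 82.6 | | Unverified |
| 4 | TransOCR | Accuracy (%) | 72.8 | | Unverified |
| 5 | SRN | Accuracy (%) | 65 | | Unverified |
| 6 | MORAN | Accuracy (%) | 64.3 | | Unverified |
| 7 | SEED | Accuracy (%) | 61.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4o | Average Accuracy | 76.22 | | Unverified |
| 2 | Gemini-1.5 Pro | Average Accuracy | 76.13 | | Unverified |
| 3 | Claude-3 Sonnet | Average Accuracy | 67.71 | | Unverified |
| 4 | RapidOCR | Average Accuracy | 56.98 | | Unverified |
| 5 | EasyOCR | Average Accuracy | 49.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | STREET | Sequence error | 27.54 | | Unverified |
| 2 | SEE | Sequence error | 22 | | Unverified |
| 3 | AttentionOCR_Inception-resnet-v2_Location | Sequence error | 15.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | I2L-NOPOOL | BLEU | 89.09 | | Unverified |
| 2 | I2L-STRIPS | BLEU | 89 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Tesseract | Character Error Rate (CER) | 0.08 | | Unverified |
| 2 | EasyOCR | Character Error Rate (CER) | 0.07 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | I2L-STRIPS | BLEU | 88.86 | | Unverified |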
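
The Character Error Rate (CER) metric listed in the benchmark results is commonly defined as the character-level edit distance between the OCR output and the reference transcription, divided by the reference length. This page does not say how the listed values were computed, so the sketch below only illustrates that common definition. The function names `levenshtein` and `character_error_rate` are illustrative, not taken from any listed model or library.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `reference` into `hypothesis`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # drop a reference character
                curr[j - 1] + 1,     # add a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER as edit distance over reference length (assumed definition).
    Lower is better; the value can exceed 1.0 for very long outputs."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One substituted character in an 11-character reference.
    print(character_error_rate("hello world", "hallo world"))
```

Under this definition, a CER of 0.08 would correspond to roughly 8 character edits per 100 reference characters.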