SOTAVerified

Optical Character Recognition (OCR)

Optical Character Recognition or Optical Character Reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text. The source may be a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo, or license plates on cars), or subtitle text superimposed on an image (for example, from a television broadcast).

Papers

Showing 1151-1200 of 1209 papers

Title | Status | Hype
Chandojnanam: A Sanskrit Meter Identification and Utilization System | Code | 0
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning | Code | 0
AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding | Code | 0
Ekush: A Multipurpose and Multitype Comprehensive Database for Online Off-Line Bangla Handwritten Characters | Code | 0
NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition | Code | 0
An Unsupervised Normalization Algorithm for Noisy Text: A Case Study for Information Retrieval and Stance Detection | Code | 0
Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model | Code | 0
TransDocs: Optical Character Recognition with word to word translation | Code | 0
SUT: a new multi-purpose synthetic dataset for Farsi document image analysis | Code | 0
Object detection deep learning networks for Optical Character Recognition | Code | 0
Efficient Video-Based ALPR System Using YOLO and Visual Rhythm | Code | 0
Relation-Rich Visual Document Generator for Visual Information Extraction | Code | 0
A Survey of Deep Learning Approaches for OCR and Document Understanding | Code | 0
An Efficient and Layout-Independent Automatic License Plate Recognition System Based on the YOLO detector | Code | 0
Reproducibility, Replicability, and Insights into Visual Document Retrieval with Late Interaction | Code | 0
Case Study of a highly automated Layout Analysis and OCR of an incunabulum: 'Der Heiligen Leben' (1488) | Code | 0
Crossing Language Borders: A Pipeline for Indonesian Manhwa Translation | Code | 0
Efficient Multi-domain Text Recognition Deep Neural Network Parameterization with Residual Adapters | Code | 0
Transfer Learning for OCRopus Model Training on Early Printed Books | Code | 0
Calibrated Structured Prediction | Code | 0
A Gaussian Process Upsampling Model for Improvements in Optical Character Recognition | Code | 0
SynFinTabs: A Dataset of Synthetic Financial Tables for Information and Table Extraction | Code | 0
Syntactic Language Change in English and German: Metrics, Parsers, and Convergences | Code | 0
Efficient License Plate Recognition in Videos Using Visual Rhythm and Accumulative Line Analysis | Code | 0
EATEN: Entity-aware Attention for Single Shot Visual Text Extraction | Code | 0
Adversarial Training with OCR Modality Perturbation for Scene-Text Visual Question Answering | Code | 0
Synthetic Document Question Answering in Hungarian | Code | 0
Corpus for Coreference Resolution on Scientific Papers | Code | 0
Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition | Code | 0
CORD: A Consolidated Receipt Dataset for Post-OCR Parsing | Code | 0
Early evidence of how LLMs outperform traditional systems on OCR/HTR tasks for historical records | Code | 0
Advancing Post-OCR Correction: A Comparative Study of Synthetic Data | Code | 0
Robust Scene Text Recognition with Automatic Rectification | Code | 0
Building a Part-of-Speech Tagged Corpus for Drenjongke (Bhutia) | Code | 0
Time-Aware Word Embeddings for Three Lebanese News Archives | Code | 0
OCR-Reasoning Benchmark: Unveiling the True Capabilities of MLLMs in Complex Text-Rich Image Reasoning | Code | 0
RoundTripOCR: A Data Generation Technique for Enhancing Post-OCR Error Correction in Low-Resource Devanagari Languages | Code | 0
Convolution-based Probability Gradient Loss for Semantic Segmentation | Code | 0
E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation | Code | 0
SAFL: A Self-Attention Scene Text Recognizer with Focal Loss | Code | 0
E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text | Code | 0
Brno Mobile OCR Dataset | Code | 0
Comparative analysis of optical character recognition methods for Sámi texts from the National Library of Norway | Code | 0
DuoSearch: A Novel Search Engine for Bulgarian Historical Documents | Code | 0
ASTER: An Attentional Scene Text Recognizer with Flexible Rectification | Code | 0
An agentic system with reinforcement-learned subsystem improvements for parsing form-like documents | Code | 0
Binary Document Image Super Resolution for Improved Readability and OCR Performance | Code | 0
A Skip-connected Multi-column Network for Isolated Handwritten Bangla Character and Digit recognition | Code | 0
Adapting the Tesseract Open-Source OCR Engine for Tamil and Sinhala Legacy Fonts and Creating a Parallel Corpus for Tamil-Sinhala-English | Code | 0
BiblioPage: A Dataset of Scanned Title Pages for Bibliographic Metadata Extraction | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | DTrOCR | Accuracy (%) | 89.6 | | Unverified
2 | DTrOCR 105M | Accuracy (%) | 89.6 | | Unverified
3 | MaskOCR-L | Accuracy (%) | 82.6 | | Unverified
4 | TransOCR | Accuracy (%) | 72.8 | | Unverified
5 | SRN | Accuracy (%) | 65 | | Unverified
6 | MORAN | Accuracy (%) | 64.3 | | Unverified
7 | SEED | Accuracy (%) | 61.2 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4o | Average Accuracy | 76.22 | | Unverified
2 | Gemini-1.5 Pro | Average Accuracy | 76.13 | | Unverified
3 | Claude-3 Sonnet | Average Accuracy | 67.71 | | Unverified
4 | RapidOCR | Average Accuracy | 56.98 | | Unverified
5 | EasyOCR | Average Accuracy | 49.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | STREET | Sequence error | 27.54 | | Unverified
2 | SEE | Sequence error | 22 | | Unverified
3 | AttentionOCR_Inception-resnet-v2_Location | Sequence error | 15.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | I2L-NOPOOL | BLEU | 89.09 | | Unverified
2 | I2L-STRIPS | BLEU | 89 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Tesseract | Character Error Rate (CER) | 0.08 | | Unverified
2 | EasyOCR | Character Error Rate (CER) | 0.07 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | I2L-STRIPS | BLEU | 88.86 | | Unverified
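
One of the benchmarks above reports Character Error Rate (CER) without defining it. As a rough illustration, CER is commonly computed as the character-level edit distance between the OCR output and the reference transcription, divided by the reference length. The sketch below assumes that common definition; this page does not say how its listed figures were computed, and the function names `edit_distance` and `character_error_rate` are hypothetical.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER under the assumed definition: edit distance / reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition lower is better, and a value can exceed 1.0 when the OCR output contains many spurious characters.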