SOTAVerified

Scene Text Recognition

See Scene Text Detection for leaderboards in this task.

Papers

Showing 51-100 of 269 papers

| Title | Status | Hype |
| --- | --- | --- |
| Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition | Code | 1 |
| Data Augmentation for Scene Text Recognition | Code | 1 |
| SVIPTR: Fast and Efficient Scene Text Recognition with Vision Permutable Extractor | Code | 1 |
| Decoupled Attention Network for Text Recognition | Code | 1 |
| Relational Contrastive Learning for Scene Text Recognition | Code | 1 |
| Stratified Domain Adaptation: A Progressive Self-Training Approach for Scene Text Recognition | Code | 1 |
| Dictionary-Guided Scene Text Recognition | Code | 1 |
| PIMNet: A Parallel, Iterative and Mimicking Network for Scene Text Recognition | Code | 1 |
| Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement | Code | 1 |
| Self-supervised Character-to-Character Distillation for Text Recognition | Code | 1 |
| Looking and Listening: Audio Guided Text Recognition | Code | 1 |
| Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition | Code | 1 |
| AutoSTR: Efficient Backbone Search for Scene Text Recognition | Code | 1 |
| Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition | Code | 1 |
| Scene Text Recognition Models Explainability Using Local Features | Code | 1 |
| Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition | Code | 1 |
| Self-supervised Implicit Glyph Attention for Text Recognition | Code | 1 |
| Efficient scene text image super-resolution with semantic guidance | Code | 1 |
| B-Spline Texture Coefficients Estimator for Screen Content Image Super-Resolution | Code | 1 |
| SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition | Code | 1 |
| CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition | Code | 1 |
| MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition | Code | 1 |
| CentripetalText: An Efficient Text Instance Representation for Scene Text Detection | Code | 1 |
| Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation | Code | 1 |
| Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition | Code | 1 |
| MASTER: Multi-Aspect Non-local Network for Scene Text Recognition | Code | 1 |
| Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition | Code | 1 |
| Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark | Code | 1 |
| Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution | Code | 1 |
| Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer | Code | 1 |
| From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network | Code | 1 |
| ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting | Code | 1 |
| UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World | Code | 1 |
| Bidirectional Scene Text Recognition with a Single Decoder | Code | 0 |
| DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond | Code | 0 |
| SAFL: A Self-Attention Scene Text Recognizer with Focal Loss | Code | 0 |
| RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition | Code | 0 |
| RewriteNet: Reliable Scene Text Editing with Implicit Decomposition of Text Contents and Styles | Code | 0 |
| Robust Scene Text Recognition with Automatic Rectification | Code | 0 |
| A Feasible Framework for Arbitrary-Shaped Scene Text Recognition | Code | 0 |
| Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition | Code | 0 |
| Relational Contrastive Learning and Masked Image Modeling for Scene Text Recognition | Code | 0 |
| ASTER: An Attentional Scene Text Recognizer with Flexible Rectification | Code | 0 |
| Decoder Pre-Training with only Text for Scene Text Recognition | Code | 0 |
| A Holistic Representation Guided Attention Network for Scene Text Recognition | Code | 0 |
| Reading Between the Lanes: Text VideoQA on the Road | Code | 0 |
| Reading Scene Text in Deep Convolutional Sequences | Code | 0 |
| Revisiting Classification Perspective on Scene Text Recognition | Code | 0 |
| IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition | Code | 0 |
| Instruction-Guided Scene Text Recognition | Code | 0 |

Benchmark Results

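| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CLIP4STR-L* | Accuracy | 99.42 | | Unverified |
| 2 | DTrOCR 105M | Accuracy | 99.4 | | Unverified |
| 3 | CLIP4STR-L (DataComp-1B) | Accuracy | 99 | | Unverified |
| 4 | CLIP4STR-L | Accuracy | 98.5 | | Unverified |
| 5 | MGP-STR | Accuracy | 98.5 | | Unverified |
| 6 | CCD-ViT-Small(ARD_2.8M) | Accuracy | 98.3 | | Unverified |
| 7 | CLIP4STR-B | Accuracy | 98.3 | | Unverified |
| 8 | CCD-ViT-Base(ARD_2.8M) | Accuracy | 98.3 | | Unverified |
| 9 | MATRN | Accuracy | 97.9 | | Unverified |
| 10 | S-GTR | Accuracy | 97.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CLIP4STR-H (DFN-5B) | Accuracy | 99.1 | | Unverified |
| 2 | DTrOCR 105M | Accuracy | 98.9 | | Unverified |
| 3 | CLIP4STR-B* | Accuracy | 98.76 | | Unverified |
| 4 | CLIP4STR-L (DataComp-1B) | Accuracy | 98.6 | | Unverified |
| 5 | MGP-STR | Accuracy | 98.6 | | Unverified |
| 6 | CLIP4STR-L | Accuracy | 98.5 | | Unverified |
| 7 | CPPD | Accuracy | 98.5 | | Unverified |
| 8 | CLIP4STR-B | Accuracy | 98.3 | | Unverified |
| 9 | CCD-ViT-Base(ARD_2.8M) | Accuracy | 97.8 | | Unverified |
| 10 | CCD-ViT-Small(ARD_2.8M) | Accuracy | 96.4 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DTrOCR 105M | Accuracy | 93.5 | | Unverified |
| 2 | CLIP4STR-L* | Accuracy | 92.6 | | Unverified |
| 3 | CPPD | Accuracy | 91.7 | | Unverified |
| 4 | CLIP4STR-L (DataComp-1B) | Accuracy | 91.4 | | Unverified |
| 5 | MGP-STR | Accuracy | 90.9 | | Unverified |
| 6 | CLIP4STR-L | Accuracy | 90.8 | | Unverified |
| 7 | CLIP4STR-B | Accuracy | 90.6 | | Unverified |
| 8 | SIGA_S | Accuracy | 87.6 | | Unverified |
| 9 | S-GTR | Accuracy | 87.3 | | Unverified |
| 10 | MATRN | Accuracy | 86.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CPPD | Accuracy | 99.7 | | Unverified |
| 2 | CLIP4STR-L (DataComp-1B) | Accuracy | 99.7 | | Unverified |
| 3 | CLIP4STR-B* | Accuracy | 99.65 | | Unverified |
| 4 | MGP-STR | Accuracy | 99.31 | | Unverified |
| 5 | CLIP4STR-B | Accuracy | 99.3 | | Unverified |
| 6 | DTrOCR 105M | Accuracy | 99.1 | | Unverified |
| 7 | CLIP4STR-L | Accuracy | 99 | | Unverified |
| 8 | CCD-ViT-Base(ARD_2.8M) | Accuracy | 98.3 | | Unverified |
| 9 | CCD-ViT-Small(ARD_2.8M) | Accuracy | 98.3 | | Unverified |
| 10 | CCD-ViT-Tiny(ARD_2.8M) | Accuracy | 95.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CLIP4STR-L (DataComp-1B) | Accuracy | 99.6 | | Unverified |
| 2 | DTrOCR 105M | Accuracy | 99.6 | | Unverified |
| 3 | CLIP4STR-B (DataComp-1B) | Accuracy | 99.5 | | Unverified |
| 4 | CLIP4STR-L | Accuracy | 99.5 | | Unverified |
| 5 | CPPD | Accuracy | 99.3 | | Unverified |
| 6 | CLIP4STR-B | Accuracy | 99.2 | | Unverified |
| 7 | MGP-STR | Accuracy | 98.8 | | Unverified |
| 8 | CCD-ViT-Base(ARD_2.8M) | Accuracy | 98 | | Unverified |
| 9 | CCD-ViT-Small(ARD_2.8M) | Accuracy | 98 | | Unverified |
| 10 | S-GTR | Accuracy | 97.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DTrOCR 105M | Accuracy | 98.6 | | Unverified |
| 2 | MGP-STR | Accuracy | 98.3 | | Unverified |
| 3 | CLIP4STR-L* | Accuracy | 98.13 | | Unverified |
| 4 | CLIP4STR-L (DataComp-1B) | Accuracy | 98.1 | | Unverified |
| 5 | CLIP4STR-L | Accuracy | 97.4 | | Unverified |
| 6 | CLIP4STR-B | Accuracy | 97.2 | | Unverified |
| 7 | CPPD | Accuracy | 96.7 | | Unverified |
| 8 | CCD-ViT-Base | Accuracy | 96.1 | | Unverified |
| 9 | CCD-ViT-Small | Accuracy | 92.7 | | Unverified |
| 10 | CCD-ViT-Tiny | Accuracy | 91.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Yet Another Text Recognizer | Accuracy | 97.1 | | Unverified |
| 2 | SIGA_T | Accuracy | 97 | | Unverified |
| 3 | SATRN | Accuracy | 96.7 | | Unverified |
| 4 | DAN | Accuracy | 95 | | Unverified |
| 5 | SAFL | Accuracy | 95 | | Unverified |
| 6 | CSTR | Accuracy | 94.8 | | Unverified |
| 7 | Baek et al. | Accuracy | 94.4 | | Unverified |
| 8 | ViTSTR | Accuracy | 94.3 | | Unverified |
| 9 | AON | Accuracy | 91.5 | | Unverified |
| 10 | RARE | Accuracy | 90.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CLIP4STR-H (DFN-5B) | 1:1 Accuracy | 90.9 | | Unverified |
| 2 | CLIP4STR-L (DataComp-1B) | 1:1 Accuracy | 90.6 | | Unverified |
| 3 | CLIP4STR-L | 1:1 Accuracy | 88.8 | | Unverified |
| 4 | CLIP4STR-B | 1:1 Accuracy | 87 | | Unverified |
| 5 | CCD-ViT-Base | 1:1 Accuracy | 86 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CLIP4STR-L (DataComp-1B) | Accuracy (%) | 86.4 | | Unverified |
| 2 | CLIP4STR-L | Accuracy (%) | 85.9 | | Unverified |
| 3 | CLIP4STR-B | Accuracy (%) | 85.8 | | Unverified |
| 4 | MGP-STR | Accuracy (%) | 85.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CLIP4STR-L | 1:1 Accuracy | 81.9 | | Unverified |
| 2 | MGP-STR | 1:1 Accuracy | 81.7 | | Unverified |
| 3 | CLIP4STR-B | 1:1 Accuracy | 81.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CLIP4STR-L | 1:1 Accuracy | 82.7 | | Unverified |
| 2 | CLIP4STR-B | 1:1 Accuracy | 79.8 | | Unverified |
| 3 | CCD-ViT-Base | 1:1 Accuracy | 77.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CLIP4STR-L (DataComp-1B) | Accuracy (%) | 92.2 | | Unverified |
| 2 | MGP-STR | Accuracy (%) | 91 | | Unverified |
| 3 | CLIP4STR-B | Accuracy (%) | 86.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ABINet-LV+TPS++ | Accuracy | 97.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MLDG | Average Accuracy | 19.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ABINet-LV+TPS++ | Accuracy | 89.6 | | Unverified |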