SOTAVerified

Scene Text Recognition

See Scene Text Detection for leaderboards in this task.

Papers

Showing 26-50 of 269 papers

| Title | Status | Hype |
|---|---|---|
| Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition | Code | 1 |
| Show Me the World in My Language: Establishing the First Baseline for Scene-Text to Scene-Text Translation | Code | 1 |
| Relational Contrastive Learning for Scene Text Recognition | Code | 1 |
| Towards Robust Scene Text Image Super-resolution via Explicit Location Enhancement | Code | 1 |
| Looking and Listening: Audio Guided Text Recognition | Code | 1 |
| MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition | Code | 1 |
| CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | Code | 1 |
| Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition | Code | 1 |
| TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition | Code | 1 |
| B-Spline Texture Coefficients Estimator for Screen Content Image Super-Resolution | Code | 1 |
| ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting | Code | 1 |
| Masked Vision-Language Transformers for Scene Text Recognition | Code | 1 |
| Self-supervised Character-to-Character Distillation for Text Recognition | Code | 1 |
| Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition | Code | 1 |
| Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition | Code | 1 |
| Multimodal Semi-Supervised Learning for Text Recognition | Code | 1 |
| Pushing the Performance Limit of Scene Text Recognizer without Human Annotation | Code | 1 |
| IterVM: Iterative Vision Modeling Module for Scene Text Recognition | Code | 1 |
| SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization | Code | 1 |
| Training Protocol Matters: Towards Accurate Scene Text Recognition via Training Protocol Searching | Code | 1 |
| Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement | Code | 1 |
| Self-supervised Implicit Glyph Attention for Text Recognition | Code | 1 |
| On the Cross-dataset Generalization in License Plate Recognition | Code | 1 |
| Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition | Code | 1 |
| Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution | Code | 1 |

Benchmark Results

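| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CLIP4STR-L* | Accuracy | 99.42 | | Unverified |
| 2 | DTrOCR 105M | Accuracy | 99.4 | | Unverified |
| 3 | CLIP4STR-L (DataComp-1B) | Accuracy | 99 | | Unverified |
| 4 | MGP-STR | Accuracy | 98.5 | | Unverified |
| 5 | CLIP4STR-L | Accuracy | 98.5 | | Unverified |
| 6 | CLIP4STR-B | Accuracy | 98.3 | | Unverified |
| 7 | CCD-ViT-Base(ARD_2.8M) | Accuracy | 98.3 | | Unverified |
| 8 | CCD-ViT-Small(ARD_2.8M) | Accuracy | 98.3 | | Unverified |
| 9 | MATRN | Accuracy | 97.9 | | Unverified |
| 10 | S-GTR | Accuracy | 97.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CLIP4STR-H (DFN-5B) | Accuracy | 99.1 | | Unverified |
| 2 | DTrOCR 105M | Accuracy | 98.9 | | Unverified |
| 3 | CLIP4STR-B* | Accuracy | 98.76 | | Unverified |
| 4 | MGP-STR | Accuracy | 98.6 | | Unverified |
| 5 | CLIP4STR-L (DataComp-1B) | Accuracy | 98.6 | | Unverified |
| 6 | CLIP4STR-L | Accuracy | 98.5 | | Unverified |
| 7 | CPPD | Accuracy | 98.5 | | Unverified |
| 8 | CLIP4STR-B | Accuracy | 98.3 | | Unverified |
| 9 | CCD-ViT-Base(ARD_2.8M) | Accuracy | 97.8 | | Unverified |
| 10 | CCD-ViT-Small(ARD_2.8M) | Accuracy | 96.4 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DTrOCR 105M | Accuracy | 93.5 | | Unverified |
| 2 | CLIP4STR-L* | Accuracy | 92.6 | | Unverified |
| 3 | CPPD | Accuracy | 91.7 | | Unverified |
| 4 | CLIP4STR-L (DataComp-1B) | Accuracy | 91.4 | | Unverified |
| 5 | MGP-STR | Accuracy | 90.9 | | Unverified |
| 6 | CLIP4STR-L | Accuracy | 90.8 | | Unverified |
| 7 | CLIP4STR-B | Accuracy | 90.6 | | Unverified |
| 8 | SIGA_S | Accuracy | 87.6 | | Unverified |
| 9 | S-GTR | Accuracy | 87.3 | | Unverified |
| 10 | MATRN | Accuracy | 86.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CPPD | Accuracy | 99.7 | | Unverified |
| 2 | CLIP4STR-L (DataComp-1B) | Accuracy | 99.7 | | Unverified |
| 3 | CLIP4STR-B* | Accuracy | 99.65 | | Unverified |
| 4 | MGP-STR | Accuracy | 99.31 | | Unverified |
| 5 | CLIP4STR-B | Accuracy | 99.3 | | Unverified |
| 6 | DTrOCR 105M | Accuracy | 99.1 | | Unverified |
| 7 | CLIP4STR-L | Accuracy | 99 | | Unverified |
| 8 | CCD-ViT-Base(ARD_2.8M) | Accuracy | 98.3 | | Unverified |
| 9 | CCD-ViT-Small(ARD_2.8M) | Accuracy | 98.3 | | Unverified |
| 10 | CCD-ViT-Tiny(ARD_2.8M) | Accuracy | 95.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DTrOCR 105M | Accuracy | 99.6 | | Unverified |
| 2 | CLIP4STR-L (DataComp-1B) | Accuracy | 99.6 | | Unverified |
| 3 | CLIP4STR-L | Accuracy | 99.5 | | Unverified |
| 4 | CLIP4STR-B (DataComp-1B) | Accuracy | 99.5 | | Unverified |
| 5 | CPPD | Accuracy | 99.3 | | Unverified |
| 6 | CLIP4STR-B | Accuracy | 99.2 | | Unverified |
| 7 | MGP-STR | Accuracy | 98.8 | | Unverified |
| 8 | CCD-ViT-Base(ARD_2.8M) | Accuracy | 98 | | Unverified |
| 9 | CCD-ViT-Small(ARD_2.8M) | Accuracy | 98 | | Unverified |
| 10 | S-GTR | Accuracy | 97.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DTrOCR 105M | Accuracy | 98.6 | | Unverified |
| 2 | MGP-STR | Accuracy | 98.3 | | Unverified |
| 3 | CLIP4STR-L* | Accuracy | 98.13 | | Unverified |
| 4 | CLIP4STR-L (DataComp-1B) | Accuracy | 98.1 | | Unverified |
| 5 | CLIP4STR-L | Accuracy | 97.4 | | Unverified |
| 6 | CLIP4STR-B | Accuracy | 97.2 | | Unverified |
| 7 | CPPD | Accuracy | 96.7 | | Unverified |
| 8 | CCD-ViT-Base | Accuracy | 96.1 | | Unverified |
| 9 | CCD-ViT-Small | Accuracy | 92.7 | | Unverified |
| 10 | CCD-ViT-Tiny | Accuracy | 91.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Yet Another Text Recognizer | Accuracy | 97.1 | | Unverified |
| 2 | SIGA_T | Accuracy | 97 | | Unverified |
| 3 | SATRN | Accuracy | 96.7 | | Unverified |
| 4 | DAN | Accuracy | 95 | | Unverified |
| 5 | SAFL | Accuracy | 95 | | Unverified |
| 6 | CSTR | Accuracy | 94.8 | | Unverified |
| 7 | Baek et al. | Accuracy | 94.4 | | Unverified |
| 8 | ViTSTR | Accuracy | 94.3 | | Unverified |
| 9 | AON | Accuracy | 91.5 | | Unverified |
| 10 | RARE | Accuracy | 90.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CLIP4STR-H (DFN-5B) | 1:1 Accuracy | 90.9 | | Unverified |
| 2 | CLIP4STR-L (DataComp-1B) | 1:1 Accuracy | 90.6 | | Unverified |
| 3 | CLIP4STR-L | 1:1 Accuracy | 88.8 | | Unverified |
| 4 | CLIP4STR-B | 1:1 Accuracy | 87 | | Unverified |
| 5 | CCD-ViT-Base | 1:1 Accuracy | 86 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CLIP4STR-L (DataComp-1B) | Accuracy (%) | 86.4 | | Unverified |
| 2 | CLIP4STR-L | Accuracy (%) | 85.9 | | Unverified |
| 3 | CLIP4STR-B | Accuracy (%) | 85.8 | | Unverified |
| 4 | MGP-STR | Accuracy (%) | 85.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CLIP4STR-L | 1:1 Accuracy | 81.9 | | Unverified |
| 2 | MGP-STR | 1:1 Accuracy | 81.7 | | Unverified |
| 3 | CLIP4STR-B | 1:1 Accuracy | 81.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CLIP4STR-L | 1:1 Accuracy | 82.7 | | Unverified |
| 2 | CLIP4STR-B | 1:1 Accuracy | 79.8 | | Unverified |
| 3 | CCD-ViT-Base | 1:1 Accuracy | 77.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CLIP4STR-L (DataComp-1B) | Accuracy (%) | 92.2 | | Unverified |
| 2 | MGP-STR | Accuracy (%) | 91 | | Unverified |
| 3 | CLIP4STR-B | Accuracy (%) | 86.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ABINet-LV+TPS++ | Accuracy | 97.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MLDG | Average Accuracy | 19.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ABINet-LV+TPS++ | Accuracy | 89.6 | | Unverified |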