Text Spotting

Scene text spotting combines scene text detection and scene text recognition in a single end-to-end system: it localizes text in natural images and transcribes it.

Papers

Showing 1-50 of 112 papers

Title | Status | Hype
TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document | Code | 5
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting | Code | 2
VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization | Code | 2
Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis | Code | 2
Bridging the Gap Between End-to-End and Two-Step Text Spotting | Code | 2
SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition | Code | 2
DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting | Code | 2
ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer | Code | 1
FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation | Code | 1
SPTS: Single-Point Text Spotting | Code | 1
ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting | Code | 1
Towards Unified Scene Text Spotting based on Sequence Generation | Code | 1
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training | Code | 1
DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training | Code | 1
GoMatching++: Parameter- and Data-Efficient Arbitrary-Shaped Video Text Spotting and Benchmarking | Code | 1
A Bilingual, OpenWorld Video Text Dataset and End-to-end Video Text Spotter with Transformer | Code | 1
Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution | Code | 1
SPTS v2: Single-Point Scene Text Spotting | Code | 1
TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification | Code | 1
Text Spotting Transformers | Code | 1
TPSNet: Reverse Thinking of Thin Plate Splines for Arbitrary Shape Scene Text Representation | Code | 1
GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching | Code | 1
End-to-End Video Text Spotting with Transformer | Code | 1
Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting | Code | 1
ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network | Code | 1
GLASS: Global to Local Attention for Scene-Text Spotting | Code | 1
Dictionary-Guided Scene Text Recognition | Code | 1
AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting | Code | 1
SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting | Code | 1
ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting | Code | 1
Parrot Captions Teach CLIP to Spot Text | Code | 1
ICDAR 2019 Competition on Large-scale Street View Text with Partial Labeling -- RRC-LSVT | Code | 1
PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text | Code | 1
Scalable Mask Annotation for Video Text Spotting | Code | 1
SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting | Code | 1
Scene Text Retrieval via Joint Text Detection and Similarity Learning | Code | 1
Mixed Text Recognition with Efficient Parameter Fine-Tuning and Transformer | - | 0
Block-level Text Spotting with LLMs | - | 0
Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes | - | 0
Beyond the Mud: Datasets and Benchmarks for Computer Vision in Off-Road Racing | - | 0
Deformation Robust Text Spotting with Geometric Prior | - | 0
DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting | - | 0
Hear the Scene: Audio-Enhanced Text Spotting | - | 0
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration | - | 0
ARTS: Eliminating Inconsistency between Text Detection and Recognition with Auto-Rectification Text Spotter | - | 0
Deep Neural Network for Semantic-based Text Recognition in Images | - | 0
ICDAR 2023 Video Text Reading Competition for Dense and Small Text | - | 0
Arbitrary Reading Order Scene Text Spotter with Local Semantics Guidance | - | 0
A pooling based scene text proposal technique for scene text reading in the wild | - | 0
Inductive Visual Localisation: Factorised Training for Superior Generalisation | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | UNITS | F-measure (%) - Strong Lexicon | 89 | - | Unverified
2 | DeepSolo (ViTAEv2-S, TextOCR) | F-measure (%) - Strong Lexicon | 88.1 | - | Unverified
3 | DeepSolo (ResNet-50, TextOCR) | F-measure (%) - Strong Lexicon | 88 | - | Unverified
4 | DeepSolo (ResNet-50) | F-measure (%) - Strong Lexicon | 86.8 | - | Unverified
5 | SRTS | F-measure (%) - Strong Lexicon | 85.6 | - | Unverified
6 | TESTR | F-measure (%) - Strong Lexicon | 85.2 | - | Unverified
7 | A3S | F-measure (%) - Strong Lexicon | 84.8 | - | Unverified
8 | GLASS | F-measure (%) - Strong Lexicon | 84.7 | - | Unverified
9 | SwinTextSpotter | F-measure (%) - Strong Lexicon | 83.9 | - | Unverified
10 | FOTS | F-measure (%) - Strong Lexicon | 83.6 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DeepSolo (ViTAEv2-S, TextOCR) | F-measure (%) - No Lexicon | 83.6 | - | Unverified
2 | DeepSolo (ResNet-50, TextOCR) | F-measure (%) - No Lexicon | 82.5 | - | Unverified
3 | DeepSolo (ResNet-50) | F-measure (%) - No Lexicon | 79.7 | - | Unverified
4 | A3S | F-measure (%) - No Lexicon | 79.4 | - | Unverified
5 | UNITS | F-measure (%) - No Lexicon | 78.7 | - | Unverified
6 | GLASS | F-measure (%) - No Lexicon | 76.6 | - | Unverified
7 | DEER | F-measure (%) - No Lexicon | 74.8 | - | Unverified
8 | SwinTextSpotter | F-measure (%) - No Lexicon | 74.3 | - | Unverified
9 | TESTR | F-measure (%) - No Lexicon | 73.3 | - | Unverified
10 | MANGO | F-measure (%) - No Lexicon | 72.9 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | A3S | F-measure (%) - No Lexicon | 64.4 | - | Unverified
2 | DeepSolo (ResNet-50) | F-measure (%) - No Lexicon | 64.2 | - | Unverified
3 | SPTS | F-measure (%) - No Lexicon | 63.6 | - | Unverified
4 | ABINet++ | F-measure (%) - No Lexicon | 60.2 | - | Unverified
5 | TPSNet | F-measure (%) - No Lexicon | 59.7 | - | Unverified
6 | MANGO | F-measure (%) - No Lexicon | 58.9 | - | Unverified
7 | ABCNet v2 | F-measure (%) - No Lexicon | 57.5 | - | Unverified
8 | TextPerceptron | F-measure (%) - No Lexicon | 57 | - | Unverified
9 | TESTR | F-measure (%) - No Lexicon | 56 | - | Unverified
10 | SwinTextSpotter | F-measure (%) - No Lexicon | 51.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DeepSolo (ViTAEv2-S, TextOCR) | F-measure (%) - No Lexicon | 68.8 | - | Unverified
2 | DeepSolo (ResNet-50, TextOCR) | F-measure (%) - No Lexicon | 64.6 | - | Unverified
3 | SwinTextSpotter | F-measure (%) - No Lexicon | 55.4 | - | Unverified
4 | DeepSolo (ResNet-50) | F-measure (%) - No Lexicon | 48.5 | - | Unverified
5 | MaskTextSpotter v2 | F-measure (%) - No Lexicon | 39 | - | Unverified
6 | SPTS | F-measure (%) - No Lexicon | 38.3 | - | Unverified
7 | ABCNet v2 | F-measure (%) - No Lexicon | 34.5 | - | Unverified
8 | TESTR | F-measure (%) - No Lexicon | 34.2 | - | Unverified
9 | ABCNet | F-measure (%) - No Lexicon | 22.2 | - | Unverified
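
Every result above is reported as an F-measure in percent. As a rough illustration, the sketch below assumes the usual definition of F-measure as the harmonic mean of precision and recall. How a predicted word is judged correct (for example, the overlap threshold or lexicon-constrained matching) depends on each benchmark's own protocol, which this page does not specify. The function name and the counts are hypothetical.

```python
def f_measure(true_positives: int, num_predictions: int, num_ground_truth: int) -> float:
    """Return the F-measure in percent from raw match counts.

    Assumes the standard harmonic-mean definition; the rule that decides
    what counts as a true positive is benchmark-specific and not modeled here.
    """
    if num_predictions == 0 or num_ground_truth == 0:
        return 0.0
    precision = true_positives / num_predictions   # share of predictions that are correct
    recall = true_positives / num_ground_truth     # share of ground-truth words that were found
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from any benchmark listed above.
score = f_measure(true_positives=80, num_predictions=100, num_ground_truth=120)
print(round(score, 1))  # 72.7
```

Because the harmonic mean penalizes imbalance, a spotter with high precision but low recall (or the reverse) scores lower than one with both values near each other.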