SOTAVerified

Scene Text Detection

Scene Text Detection is a computer vision task that involves automatically identifying and localizing text within natural images or videos. The goal is to develop algorithms that robustly detect text and mark it with bounding boxes in uncontrolled, complex settings, for example on street signs, billboards, or license plates.

Source: ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection

Papers

Showing 1-50 of 213 papers

| Title | Status | Hype |
| --- | --- | --- |
| Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion | Code | 7 |
| Character Region Awareness for Text Detection | Code | 4 |
| Real-time Scene Text Detection with Differentiable Binarization | Code | 2 |
| Towards End-to-End Unified Scene Text Detection and Layout Analysis | Code | 2 |
| DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting | Code | 2 |
| DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting | Code | 2 |
| SRFormer: Text Detection Transformer with Incorporated Segmentation and Regression | Code | 2 |
| Revisiting Tampered Scene Text Detection in the Era of Generative AI | Code | 2 |
| Turning a CLIP Model into a Scene Text Detector | Code | 2 |
| SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition | Code | 2 |
| DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer | Code | 2 |
| Turning a CLIP Model into a Scene Text Spotter | Code | 2 |
| TPSNet: Reverse Thinking of Thin Plate Splines for Arbitrary Shape Scene Text Representation | Code | 1 |
| The Devil is in Fine-tuning and Long-tailed Problems: A New Benchmark for Scene Text Detection | Code | 1 |
| UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World | Code | 1 |
| TextFuseNet: Scene Text Detection with Richer Fused Features | Code | 1 |
| Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network | Code | 1 |
| TextRay: Contour-based Geometric Modeling for Arbitrary-shaped Scene Text Detection | Code | 1 |
| ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining | Code | 1 |
| Polygon-free: Unconstrained Scene Text Detection with Box Annotations | Code | 1 |
| Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors | Code | 1 |
| Shape Robust Text Detection with Progressive Scale Expansion Network | Code | 1 |
| ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network | Code | 1 |
| EAST: An Efficient and Accurate Scene Text Detector | Code | 1 |
| Recurrent Generic Contour-based Instance Segmentation with Progressive Learning | Code | 1 |
| Synthetic-to-Real Unsupervised Domain Adaptation for Scene Text Detection in the Wild | Code | 1 |
| R^3: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge | Code | 1 |
| RRPN++: Guidance Towards More Accurate Scene Text Detection | Code | 1 |
| Shape Robust Text Detection with Progressive Scale Expansion Network | Code | 1 |
| Industrial Scene Text Detection with Refined Feature-attentive Network | Code | 1 |
| Fourier Contour Embedding for Arbitrary-Shaped Text Detection | Code | 1 |
| MixNet: Toward Accurate Detection of Challenging Scene Text in the Wild | Code | 1 |
| CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning | Code | 1 |
| LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network | Code | 1 |
| Arbitrary Shape Text Detection via Segmentation with Probability Maps | Code | 1 |
| Dictionary-Guided Scene Text Recognition | Code | 1 |
| Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models | Code | 1 |
| I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection | Code | 1 |
| Omnidirectional Scene Text Detection with Sequential-free Box Discretization | Code | 1 |
| PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text | Code | 1 |
| ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection | Code | 1 |
| Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection | Code | 1 |
| FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation | Code | 1 |
| ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting | Code | 1 |
| Scene Text Retrieval via Joint Text Detection and Similarity Learning | Code | 1 |
| DeRPN: Taking a further step toward more general object detection | Code | 1 |
| CBNet: A Plug-and-Play Network for Segmentation-Based Scene Text Detection | Code | 1 |
| Detecting Multi-Oriented Text with Corner-based Region Proposals | Code | 1 |
| CentripetalText: An Efficient Text Instance Representation for Scene Text Detection | Code | 1 |
| Comprehensive Studies for Arbitrary-shape Scene Text Detection | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TextFuseNet (ResNeXt-101) | F-Measure | 92.23 | | Unverified |
| 2 | CharNet H-88 (multi-scale) | F-Measure | 91.55 | | Unverified |
| 3 | CharNet H-88 (single-scale) | F-Measure | 90.97 | | Unverified |
| 4 | CharNet H-50 (multi-scale) | F-Measure | 90.16 | | Unverified |
| 5 | SBD | F-Measure | 90.1 | | Unverified |
| 6 | CharNet H-57 (multi-scale) | F-Measure | 90.06 | | Unverified |
| 7 | FOTS MS | F-Measure | 89.84 | | Unverified |
| 8 | CharNet H-50 (single-scale) | F-Measure | 89.7 | | Unverified |
| 9 | CharNet H-57 (single-scale) | F-Measure | 89.66 | | Unverified |
| 10 | PMTD | F-Measure | 89.33 | | Unverified |
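
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MixNet | F-Measure | 90.5 | | Unverified |
| 2 | SRFormer (ResNet-50) | F-Measure | 90 | | Unverified |
| 3 | DPText-DETR (ResNet-50) | F-Measure | 89 | | Unverified |
| 4 | TextFuseNet (ResNeXt-101) | F-Measure | 87.5 | | Unverified |
| 5 | FAST-B-800 | F-Measure | 87.5 | | Unverified |
| 6 | I3CL + SSL(ResNet-50) | F-Measure | 86.9 | | Unverified |
| 7 | CharNet H-88 (multi-scale) | F-Measure | 86.5 | | Unverified |
| 8 | FAST-B-640 | F-Measure | 86.4 | | Unverified |
| 9 | DBNet++ (ResNet-50) (800) | F-Measure | 86 | | Unverified |
| 10 | FAST-B-512 | F-Measure | 85.8 | | Unverified |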
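
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MixNet | F-Measure | 89.4 | | Unverified |
| 2 | FAST-B-736 | F-Measure | 87.3 | | Unverified |
| 3 | DBNet++ (ResNet-50) (736) | F-Measure | 87.2 | | Unverified |
| 4 | FAST-S-736 | F-Measure | 86.4 | | Unverified |
| 5 | DBNet++ (ResNet-18) (736) | F-Measure | 85.1 | | Unverified |
| 6 | FAST-T-736 | F-Measure | 84.9 | | Unverified |
| 7 | DB-ResNet-50 (736) | F-Measure | 84.9 | | Unverified |
| 8 | FAST-T-512 | F-Measure | 84.5 | | Unverified |
| 9 | PAN | F-Measure | 84.1 | | Unverified |
| 10 | CRAFT | F-Measure | 82.9 | | Unverified |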
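
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MixNet | F-Measure | 89.8 | | Unverified |
| 2 | SRFormer (ResNet-50) | F-Measure | 89.6 | | Unverified |
| 3 | DPText-DETR (ResNet50) | F-Measure | 88.8 | | Unverified |
| 4 | TextFuseNet (ResNeXt-101) | F-Measure | 87.4 | | Unverified |
| 5 | I3CL + SSL | F-Measure | 86.5 | | Unverified |
| 6 | PAN | F-Measure | 85 | | Unverified |
| 7 | FAST-B-640 | F-Measure | 84.2 | | Unverified |
| 8 | PAN-640 | F-Measure | 83.7 | | Unverified |
| 9 | CRAFT | F-Measure | 83.5 | | Unverified |
| 10 | DB-ResNet50 (1024) | F-Measure | 83.4 | | Unverified |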
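
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CRAFT | Precision | 97.4 | | Unverified |
| 2 | TextFuseNet (ResNeXt-101) | F-Measure | 94.61 | | Unverified |
| 3 | SPCNET | F-Measure | 92.1 | | Unverified |
| 4 | Mask TextSpotter | F-Measure | 91.7 | | Unverified |
| 5 | WordSup (VGG16-synth-icdar) | F-Measure | 90.34 | | Unverified |
| 6 | STN-OCR | F-Measure | 90.3 | | Unverified |
| 7 | PixelLink+VGG16 2s MS | F-Measure | 88.1 | | Unverified |
| 8 | TextBoxes++_MS | F-Measure | 88 | | Unverified |
| 9 | Corner Localization (multi-scale) | F-Measure | 88 | | Unverified |
| 10 | Corner-based Region Proposals | F-Measure | 87.6 | | Unverified |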
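
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PMTD* | Precision | 84.42 | | Unverified |
| 2 | Corner Localization (single-scale) | Precision | 83.8 | | Unverified |
| 3 | SBD | Precision | 82.75 | | Unverified |
| 4 | FOTS MS | Precision | 81.86 | | Unverified |
| 5 | CharNet H-88 | Precision | 81.27 | | Unverified |
| 6 | FOTS | Precision | 80.95 | | Unverified |
| 7 | SPCNET | Precision | 80.6 | | Unverified |
| 8 | CRAFT | Precision | 80.6 | | Unverified |
| 9 | PAN | Precision | 80 | | Unverified |
| 10 | GNNets | Precision | 79.63 | | Unverified |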
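
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Corner-based Region Proposals | F-Measure | 59.1 | | Unverified |
| 2 | TextBoxes++_MS | F-Measure | 58.72 | | Unverified |
| 3 | EAST + VGG16 | F-Measure | 39.45 | | Unverified |
| 4 | SSTD | F-Measure | 37 | | Unverified |
| 5 | WordSup (VGG16-synth-coco) | F-Measure | 36.8 | | Unverified |
| 6 | Yao et al. | F-Measure | 33.31 | | Unverified |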
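
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MixNet | H-Mean | 79.7 | | Unverified |
| 2 | SRFormer (ResNet-50) | H-Mean | 79.3 | | Unverified |
| 3 | TextFuseNet (ResNeXt-101) | H-Mean | 78.6 | | Unverified |
| 4 | DPText-DETR (ResNet-50) | H-Mean | 78.1 | | Unverified |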

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | BDN | F-Measure | 93.36 | | Unverified |
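
For readers unfamiliar with the metrics in these leaderboards: F-Measure (also reported as H-Mean) is the harmonic mean of precision and recall over detected text boxes. The sketch below illustrates one common way to compute it. It assumes axis-aligned boxes, greedy one-to-one matching, and an IoU threshold of 0.5; these are illustrative assumptions, not the protocol of any benchmark listed here, many of which score polygons under their own rules. The names `iou` and `detection_scores` are hypothetical.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_scores(
    preds: List[Box], gts: List[Box], iou_threshold: float = 0.5
) -> Tuple[float, float, float]:
    """Return (precision, recall, f_measure) for one image.

    Each prediction is greedily matched to the unmatched ground-truth
    box with the highest IoU; the match counts as a true positive if
    that IoU reaches the (assumed) threshold.
    """
    matched = set()
    true_pos = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(p, g)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= iou_threshold:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(gts) if gts else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    predictions = [(0, 0, 10, 10), (20, 20, 30, 30), (100, 100, 110, 110)]
    ground_truth = [(0, 0, 10, 10), (21, 21, 31, 31)]
    print(detection_scores(predictions, ground_truth))
```

In this toy case two of three predictions match a ground-truth box, so precision is 2/3, recall is 1, and the F-measure is 0.8. Leaderboard values are this kind of score expressed as a percentage and aggregated over a whole test set.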