TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document | Mar 7, 2024 | document understanding, Key Information Extraction
Code Available 55 | VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization | Apr 30, 2024 | Domain Adaptation, Domain Generalization
Code Available 25 | DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting | Nov 19, 2022 | Decoder, Scene Text Detection
Code Available 25 | Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis | Oct 25, 2023 | Text Spotting
Code Available 25 | DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting | May 31, 2023 | Decoder, Scene Text Detection
Code Available 25 | Bridging the Gap Between End-to-End and Two-Step Text Spotting | Apr 6, 2024 | Text Spotting
Code Available 25 | SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition | Mar 19, 2022 | Scene Text Detection, Text Detection
Code Available 25 | ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer | Aug 20, 2023 | Decoder, Text Detection
Code Available 15 | SPTS: Single-Point Text Spotting | Dec 15, 2021 | Language Modelling, Text Detection
Code Available 15 | Text Spotting Transformers | Apr 5, 2022 | Text Detection, Text Spotting
Code Available 15 | ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting | Nov 19, 2022 | Blocking, Language Modeling
Code Available 15 | Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting | Jul 18, 2020 | Region Proposal, Text Spotting
Code Available 15 | Scalable Mask Annotation for Video Text Spotting | May 2, 2023 | Text Spotting
Code Available 15 | SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting | Apr 14, 2025 | Domain Adaptation, Text Detection
Code Available 15 | GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching | Jan 13, 2024 | Text Detection, Text Spotting
Code Available 15 | GoMatching++: Parameter- and Data-Efficient Arbitrary-Shaped Video Text Spotting and Benchmarking | May 28, 2025 | Benchmarking, Text Spotting
Code Available 15 | FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation | May 5, 2023 | Optical Flow Estimation, Text Spotting
Code Available 15 | Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution | Jan 24, 2021 | 3D Feature Matching, document understanding
Code Available 15 | A Bilingual, OpenWorld Video Text Dataset and End-to-end Video Text Spotter with Transformer | Dec 9, 2021 | text annotation, Text Spotting
Code Available 15 | Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training | Jan 5, 2023 | Contrastive Learning, Text Spotting
Code Available 15 | End-to-End Video Text Spotting with Transformer | Mar 20, 2022 | Text Detection, Text Spotting
Code Available 15 | GLASS: Global to Local Attention for Scene-Text Spotting | Aug 5, 2022 | Text Detection, Text Spotting
Code Available 15 | Towards Unified Scene Text Spotting based on Sequence Generation | Apr 7, 2023 | Text Spotting
Code Available 15 | PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text | May 2, 2021 | Scene Text Detection, Text Detection
Code Available 15 | ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network | Feb 24, 2020 | Scene Text Detection, Text Detection
Code Available 15 | Parrot Captions Teach CLIP to Spot Text | Dec 21, 2023 | Representation Learning, text similarity
Code Available 15 | Dictionary-Guided Scene Text Recognition | Jun 19, 2021 | Scene Text Detection, Scene Text Recognition
Code Available 15 | AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting | Aug 3, 2020 | Language Modelling, Sentence
Code Available 15 | Scene Text Retrieval via Joint Text Detection and Similarity Learning | Apr 4, 2021 | Retrieval, Scene Text Detection
Code Available 15 | DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training | Aug 1, 2024 | Denoising, Graph Matching
Code Available 15 | ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting | May 8, 2021 | Text Spotting
Code Available 15 | ICDAR 2019 Competition on Large-scale Street View Text with Partial Labeling -- RRC-LSVT | Sep 17, 2019 | Text Detection, Text Spotting
Code Available 15 | SPTS v2: Single-Point Scene Text Spotting | Jan 4, 2023 | Decoder, Text Detection
Code Available 15 | SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting | Jan 15, 2024 | Text Detection, Text Spotting
Code Available 15 | TPSNet: Reverse Thinking of Thin Plate Splines for Arbitrary Shape Scene Text Representation | Oct 25, 2021 | Scene Text Detection, Scene Text Recognition
Code Available 15 | TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification | Mar 9, 2025 | Robot Navigation, STS
Code Available 15 | Semantic Relatedness Based Re-ranker for Text Spotting | Sep 17, 2019 | Clustering, Dimensionality Reduction
Code Available 05 | PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network | Apr 12, 2021 | Decoder, Optical Character Recognition (OCR)
Code Available 05 | Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance | Oct 2, 2023 | Scene Text Detection, Text Detection
Code Available 05 | Real-time End-to-End Video Text Spotter with Contrastive Representation Learning | Jul 18, 2022 | Contrastive Learning, GPU
Code Available 05 | OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition | Mar 28, 2024 | Decoder, document understanding
Code Available 05 | FOTS: Fast Oriented Text Spotting with a Unified Network | Jan 5, 2018 | Scene Text Detection, Scene Text Recognition
Code Available 05 | Single Shot Self-Reliant Scene Text Spotter by Decoupled yet Collaborative Detection and Recognition | Jul 15, 2022 | Text Detection, Text Spotting
Code Available 05 | OmniParser: A Unified Framework for Text Spotting Key Information Extraction and Table Recognition | Jan 1, 2024 | Decoder, document understanding
Code Available 05 | GloTSFormer: Global Video Text Spotting Transformer | Jan 8, 2024 | Text Spotting
Code Available 05 | Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes | Aug 22, 2019 | Scene Text Recognition, Semantic Segmentation
Code Available 05 | FastTextSpotter: A High-Efficiency Transformer for Multilingual Scene Text Spotting | Aug 27, 2024 | Benchmarking, Decoder
Code Available 05 | Extremely Low-light Image Enhancement with Scene Text Restoration | Apr 1, 2022 | Image Enhancement, Image Restoration
Code Available 05 | OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models | Feb 22, 2025 | document understanding, Key Information Extraction
Code Available 05 | ICDAR 2021 Competition on Integrated Circuit Text Spotting and Aesthetic Assessment | Jul 12, 2021 | Text Spotting
Code Available 05