TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document | Mar 7, 2024 | document understanding, Key Information Extraction
Code Available | 5 | VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization | Apr 30, 2024 | Domain Adaptation, Domain Generalization
Code Available | 2 | Bridging the Gap Between End-to-End and Two-Step Text Spotting | Apr 6, 2024 | Text Spotting
Code Available | 2 | Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis | Oct 25, 2023 | Text Spotting
Code Available | 2 | DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting | May 31, 2023 | Decoder, Scene Text Detection
Code Available | 2 | DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting | Nov 19, 2022 | Decoder, Scene Text Detection
Code Available | 2 | SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition | Mar 19, 2022 | Scene Text Detection, Text Detection
Code Available | 2 | GoMatching++: Parameter- and Data-Efficient Arbitrary-Shaped Video Text Spotting and Benchmarking | May 28, 2025 | Benchmarking, Text Spotting
Code Available | 1 | SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting | Apr 14, 2025 | Domain Adaptation, Text Detection
Code Available | 1 | TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification | Mar 9, 2025 | Robot Navigation, STS
Code Available | 1 | DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training | Aug 1, 2024 | Denoising, Graph Matching
Code Available | 1 | SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting | Jan 15, 2024 | Text Detection, Text Spotting
Code Available | 1 | GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching | Jan 13, 2024 | Text Detection, Text Spotting
Code Available | 1 | Parrot Captions Teach CLIP to Spot Text | Dec 21, 2023 | Representation Learning, text similarity
Code Available | 1 | ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer | Aug 20, 2023 | Decoder, Text Detection
Code Available | 1 | FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation | May 5, 2023 | Optical Flow Estimation, Text Spotting
Code Available | 1 | Scalable Mask Annotation for Video Text Spotting | May 2, 2023 | Text Spotting
Code Available | 1 | Towards Unified Scene Text Spotting based on Sequence Generation | Apr 7, 2023 | Text Spotting
Code Available | 1 | Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training | Jan 5, 2023 | Contrastive Learning, Text Spotting
Code Available | 1 | SPTS v2: Single-Point Scene Text Spotting | Jan 4, 2023 | Decoder, Text Detection
Code Available | 1 | ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting | Nov 19, 2022 | Blocking, Language Modeling
Code Available | 1 | GLASS: Global to Local Attention for Scene-Text Spotting | Aug 5, 2022 | Text Detection, Text Spotting
Code Available | 1 | Text Spotting Transformers | Apr 5, 2022 | Text Detection, Text Spotting
Code Available | 1 | End-to-End Video Text Spotting with Transformer | Mar 20, 2022 | Text Detection, Text Spotting
Code Available | 1 | SPTS: Single-Point Text Spotting | Dec 15, 2021 | Language Modelling, Text Detection
Code Available | 1 | A Bilingual, OpenWorld Video Text Dataset and End-to-end Video Text Spotter with Transformer | Dec 9, 2021 | text annotation, Text Spotting
Code Available | 1 | TPSNet: Reverse Thinking of Thin Plate Splines for Arbitrary Shape Scene Text Representation | Oct 25, 2021 | Scene Text Detection, Scene Text Recognition
Code Available | 1 | Dictionary-Guided Scene Text Recognition | Jun 19, 2021 | Scene Text Detection, Scene Text Recognition
Code Available | 1 | ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting | May 8, 2021 | Text Spotting
Code Available | 1 | PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text | May 2, 2021 | Scene Text Detection, Text Detection
Code Available | 1 | Scene Text Retrieval via Joint Text Detection and Similarity Learning | Apr 4, 2021 | Retrieval, Scene Text Detection
Code Available | 1 | Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution | Jan 24, 2021 | 3D Feature Matching, document understanding
Code Available | 1 | AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting | Aug 3, 2020 | Language Modelling, Sentence
Code Available | 1 | Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting | Jul 18, 2020 | Region Proposal, Text Spotting
Code Available | 1 | ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network | Feb 24, 2020 | Scene Text Detection, Text Detection
Code Available | 1 | ICDAR 2019 Competition on Large-scale Street View Text with Partial Labeling -- RRC-LSVT | Sep 17, 2019 | Text Detection, Text Spotting
Code Available | 1 | Text-Aware Image Restoration with Diffusion Models | Jun 11, 2025 | Denoising, Hallucination
Unverified | 0 | OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models | Feb 22, 2025 | document understanding, Key Information Extraction
Code Available | 0 | CLIP is Almost All You Need: Towards Parameter-Efficient Scene Text Retrieval without OCR | Jan 1, 2025 | All, Optical Character Recognition
Unverified | 0 | Hear the Scene: Audio-Enhanced Text Spotting | Dec 27, 2024 | Text Spotting
Unverified | 0 | InstructOCR: Instruction Boosting Scene Text Spotting | Dec 20, 2024 | Optical Character Recognition (OCR), Text Spotting
Code Available | 0 | Arbitrary Reading Order Scene Text Spotter with Local Semantics Guidance | Dec 13, 2024 | Scene Text Recognition, Text Spotting
Unverified | 0 | HIP: Hierarchical Point Modeling and Pre-training for Visual Information Extraction | Nov 2, 2024 | Image Reconstruction, Optical Character Recognition (OCR)
Unverified | 0 | FastTextSpotter: A High-Efficiency Transformer for Multilingual Scene Text Spotting | Aug 27, 2024 | Benchmarking, Decoder
Code Available | 0 | WeCromCL: Weakly Supervised Cross-Modality Contrastive Learning for Transcription-only Supervised Text Spotting | Jul 28, 2024 | Contrastive Learning, Text Spotting
Code Available | 0 | CLII: Visual-Text Inpainting via Cross-Modal Predictive Interaction | Jul 23, 2024 | Image Inpainting, Image Restoration
Unverified | 0 | Block-level Text Spotting with LLMs | Jun 19, 2024 | Language Modeling, Language Modelling
Unverified | 0 | LOGO: Video Text Spotting with Language Collaboration and Glyph Perception Model | May 29, 2024 | Position, Text Spotting
Unverified | 0 | Mixed Text Recognition with Efficient Parameter Fine-Tuning and Transformer | Apr 19, 2024 | Decoder, Optical Character Recognition
Unverified | 0 | Ensemble Learning for Vietnamese Scene Text Spotting in Urban Environments | Apr 1, 2024 | Ensemble Learning, Text Detection
Unverified | 0