| An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition | Jul 21, 2015 | Optical Character Recognition (OCR), Scene Text Recognition | Code Available | 4 |
| TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition | Dec 2, 2024 | Image Generation, Optical Character Recognition (OCR) | Code Available | 2 |
| A General Framework for Jersey Number Recognition in Sports Video | May 22, 2024 | Jersey Number Recognition, Scene Text Recognition | Code Available | 2 |
| Text Image Inpainting via Global Structure-Guided Diffusion Models | Jan 26, 2024 | Image Inpainting, Scene Text Recognition | Code Available | 2 |
| An Empirical Study of Scaling Law for Scene Text Recognition | Jan 1, 2024 | Optical Character Recognition (OCR), Scene Text Recognition | Code Available | 2 |
| Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning | Sep 3, 2023 | Scene Text Recognition | Code Available | 2 |
| Orientation-Independent Chinese Text Recognition in Scene Images | Sep 3, 2023 | Benchmarking, Image Reconstruction | Code Available | 2 |
| DTrOCR: Decoder-only Transformer for Optical Character Recognition | Aug 30, 2023 | Decoder, Handwritten Text Recognition | Code Available | 2 |
| Revisiting Scene Text Recognition: A Data Perspective | Jul 17, 2023 | Scene Text Recognition | Code Available | 2 |
| Scene Text Recognition with Permuted Autoregressive Sequence Models | Jul 14, 2022 | Language Modeling, Language Modelling | Code Available | 2 |
| GIT: A Generative Image-to-text Transformer for Vision and Language | May 27, 2022 | Decoder, Image Captioning | Code Available | 2 |
| Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition | Mar 24, 2025 | Contrastive Learning, Scene Text Recognition | Code Available | 1 |
| Ocean-OCR: Towards General OCR Application via a Vision-Language Model | Jan 26, 2025 | document understanding, Language Modeling | Code Available | 1 |
| Stratified Domain Adaptation: A Progressive Self-Training Approach for Scene Text Recognition | Oct 13, 2024 | Domain Adaptation, Optical Character Recognition (OCR) | Code Available | 1 |
| Scene-Text Grounding for Text-Based Video Question Answering | Sep 22, 2024 | 2k, Contrastive Learning | Code Available | 1 |
| Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition | May 9, 2024 | Contrastive Learning, Scene Text Recognition | Code Available | 1 |
| Efficient scene text image super-resolution with semantic guidance | Mar 20, 2024 | Image Super-Resolution, Scene Text Recognition | Code Available | 1 |
| Class-Aware Mask-Guided Feature Refinement for Scene Text Recognition | Feb 21, 2024 | Diversity, Scene Text Recognition | Code Available | 1 |
| SVIPTR: Fast and Efficient Scene Text Recognition with Vision Permutable Extractor | Jan 18, 2024 | Decoder, Scene Text Recognition | Code Available | 1 |
| An Empirical Study of Scaling Law for OCR | Dec 29, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 1 |
| Cross-Lingual Learning in Multilingual Scene Text Recognition | Dec 17, 2023 | Scene Text Recognition | Code Available | 1 |
| Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer | Nov 22, 2023 | Diversity, In-Context Learning | Code Available | 1 |
| Scene Text Image Super-resolution based on Text-conditional Diffusion Models | Nov 16, 2023 | Image Generation, Image Super-Resolution | Code Available | 1 |
| Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation | Oct 25, 2023 | Handwritten Text Recognition, Key Information Extraction | Code Available | 1 |
| Scene Text Recognition Models Explainability Using Local Features | Oct 14, 2023 | Prediction, Scene Text Recognition | Code Available | 1 |
| Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition | Oct 8, 2023 | Image to text, Optical Character Recognition (OCR) | Code Available | 1 |
| Show Me the World in My Language: Establishing the First Baseline for Scene-Text to Scene-Text Translation | Aug 6, 2023 | Machine Translation, Scene Text Editing | Code Available | 1 |
| Relational Contrastive Learning for Scene Text Recognition | Aug 1, 2023 | Contrastive Learning, Representation Learning | Code Available | 1 |
| Towards Robust Scene Text Image Super-resolution via Explicit Location Enhancement | Jul 19, 2023 | Image Super-Resolution, LEMMA | Code Available | 1 |
| Looking and Listening: Audio Guided Text Recognition | Jun 6, 2023 | Decoder, Scene Text Recognition | Code Available | 1 |
| MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition | May 24, 2023 | Continual Learning, Incremental Learning | Code Available | 1 |
| CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | May 23, 2023 | Decoder, Language Modeling | Code Available | 1 |
| Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition | May 9, 2023 | Scene Text Recognition | Code Available | 1 |
| TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition | May 9, 2023 | Optical Character Recognition (OCR), Scene Text Recognition | Code Available | 1 |
| B-Spline Texture Coefficients Estimator for Screen Content Image Super-Resolution | Jan 1, 2023 | Image Super-Resolution, Scene Text Recognition | Code Available | 1 |
| ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting | Nov 19, 2022 | Blocking, Language Modeling | Code Available | 1 |
| Masked Vision-Language Transformers for Scene Text Recognition | Nov 9, 2022 | Decoder, Scene Text Recognition | Code Available | 1 |
| Self-supervised Character-to-Character Distillation for Text Recognition | Nov 1, 2022 | Data Augmentation, Representation Learning | Code Available | 1 |
| Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition | Jul 31, 2022 | Scene Text Recognition | Code Available | 1 |
| Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition | Jul 1, 2022 | Contrastive Learning, Scene Text Recognition | Code Available | 1 |
| Multimodal Semi-Supervised Learning for Text Recognition | May 8, 2022 | Language Modelling, Representation Learning | Code Available | 1 |
| Pushing the Performance Limit of Scene Text Recognizer without Human Annotation | Apr 16, 2022 | Scene Text Recognition | Code Available | 1 |
| IterVM: Iterative Vision Modeling Module for Scene Text Recognition | Apr 6, 2022 | Language Modeling, Language Modelling | Code Available | 1 |
| SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization | Mar 20, 2022 | Common Sense Reasoning, Contrastive Learning | Code Available | 1 |
| Training Protocol Matters: Towards Accurate Scene Text Recognition via Training Protocol Searching | Mar 13, 2022 | CPU, GPU | Code Available | 1 |
| Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement | Mar 9, 2022 | Document Enhancement, Image Enhancement | Code Available | 1 |
| Self-supervised Implicit Glyph Attention for Text Recognition | Mar 7, 2022 | Scene Text Recognition, Text Segmentation | Code Available | 1 |
| On the Cross-dataset Generalization in License Plate Recognition | Jan 2, 2022 | Data Augmentation, License Plate Detection | Code Available | 1 |
| Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition | Dec 24, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution | Dec 13, 2021 | Image Super-Resolution, Scene Text Recognition | Code Available | 1 |