| Persis: A Persian Font Recognition Pipeline Using Convolutional Neural Networks | Oct 8, 2023 | Binarization, CPU | Code Available | 1 | 5 |
| Geometry Restoration and Dewarping of Camera-Captured Document Images | Jan 6, 2025 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| ViOCRVQA: Novel Benchmark Dataset and Vision Reader for Visual Question Answering by Understanding Vietnamese Text in Images | Apr 29, 2024 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition | Oct 7, 2021 | Label Error Detection, Optical Character Recognition | Code Available | 1 | 5 |
| T-MARS: Improving Visual Representations by Circumventing Text Feature Learning | Jul 6, 2023 | Optical Character Recognition | Code Available | 1 | 5 |
| On the Cross-dataset Generalization in License Plate Recognition | Jan 2, 2022 | Data Augmentation, License Plate Detection | Code Available | 1 | 5 |
| FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents | May 27, 2019 | Form, Optical Character Recognition | Code Available | 1 | 5 |
| Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval | Aug 1, 2024 | Attribute, Optical Character Recognition | Code Available | 1 | 5 |
| bbOCR: An Open-source Multi-domain OCR Pipeline for Bengali Documents | Aug 21, 2023 | Distortion Correction, Optical Character Recognition | Code Available | 1 | 5 |
| Implicit Feature Alignment: Learn to Convert Text Recognizer to Text Spotter | Jun 10, 2021 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| Iranis: A Large-scale Dataset of Farsi License Plate Characters | Jan 1, 2021 | Image Classification | Code Available | 1 | 5 |
| Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers | May 27, 2023 | Image Super-Resolution, License Plate Recognition | Code Available | 1 | 5 |
| ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting | Mar 1, 2024 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| CORU: Comprehensive Post-OCR Parsing and Receipt Understanding Dataset | Jun 6, 2024 | Object Detection | Code Available | 1 | 5 |
| Data Generation for Post-OCR correction of Cyrillic handwriting | Nov 27, 2023 | Handwriting Generation, Handwritten Text Recognition | Code Available | 1 | 5 |
| A Comprehensive Gold Standard and Benchmark for Comics Text Detection and Recognition | Dec 27, 2022 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model | Apr 19, 2021 | Language Modeling | Code Available | 1 | 5 |
| Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations | May 23, 2021 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| Multimodal LLMs for OCR, OCR Post-Correction, and Named Entity Recognition in Historical Documents | Apr 1, 2025 | Named Entity Recognition | Code Available | 1 | 5 |
| An Empirical Study of Scaling Law for OCR | Dec 29, 2023 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| A Two-Step Approach for Automatic OCR Post-Correction | Dec 1, 2020 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes | Oct 1, 2020 | Multi-Label Classification, Optical Character Recognition | Code Available | 1 | 5 |
| BankNote-Net: Open dataset for assistive universal currency recognition | Apr 7, 2022 | Contrastive Learning, Few-Shot Learning | Code Available | 1 | 5 |
| Confidence-aware Non-repetitive Multimodal Transformers for TextCaps | Dec 7, 2020 | Image Captioning, Optical Character Recognition | Code Available | 1 | 5 |
| Neural OCR Post-Hoc Correction of Historical Corpora | Feb 1, 2021 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| MCSCSet: A Specialist-annotated Dataset for Medical-domain Chinese Spelling Correction | Oct 21, 2022 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images? | May 18, 2025 | Logical Reasoning, Multimodal Reasoning | Code Available | 1 | 5 |
| OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation | Aug 8, 2023 | Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| Detection of Furigana Text in Images | Jul 8, 2022 | Object Detection | Code Available | 1 | 5 |
| Combining Morphological and Histogram based Text Line Segmentation in the OCR Context | Mar 16, 2021 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders | Jun 10, 2020 | Cell Segmentation, Denoising | Code Available | 1 | 5 |
| DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents | Apr 24, 2023 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| Lights, Camera, Action! A Framework to Improve NLP Accuracy over OCR documents | Aug 6, 2021 | Named Entity Recognition | Code Available | 1 | 5 |
| Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification | Feb 16, 2023 | Few-Shot Image Classification, Few-Shot Learning | Code Available | 1 | 5 |
| PEaCE: A Chemistry-Oriented Dataset for Optical Character Recognition on Scientific Documents | Mar 23, 2024 | Articles, Optical Character Recognition | Code Available | 1 | 5 |
| Boosting on the shoulders of giants in quantum device calibration | May 13, 2020 | BIG-bench Machine Learning, Few-Shot Learning | Code Available | 1 | 5 |
| Toxicity of the Commons: Curating Open-Source Pre-Training Data | Oct 29, 2024 | Optical Character Recognition (OCR) | Code Available | 1 | 5 |
| Enhancing License Plate Super-Resolution: A Layout-Aware and Character-Driven Approach | Aug 27, 2024 | License Plate Recognition, Optical Character Recognition | Code Available | 1 | 5 |
| It Takes Two to Tango: Combining Visual and Textual Information for Detecting Duplicate Video-Based Bug Reports | Jan 22, 2021 | Optical Character Recognition (OCR) | Code Available | 0 | 5 |
| Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription | Feb 27, 2025 | Handwritten Text Recognition, HTR | Code Available | 0 | 5 |
| A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision | Mar 30, 2023 | Decoder, Multi-Task Learning | Code Available | 0 | 5 |
| ASTER: An Attentional Scene Text Recognizer with Flexible Rectification | Jun 25, 2018 | Optical Character Recognition (OCR) | Code Available | 0 | 5 |
| A Skip-connected Multi-column Network for Isolated Handwritten Bangla Character and Digit recognition | Apr 27, 2020 | Optical Character Recognition (OCR) | Code Available | 0 | 5 |
| iExam: A Novel Online Exam Monitoring and Analysis System Based on Face Detection and Recognition | Jun 27, 2022 | Face Detection, Face Recognition | Code Available | 0 | 5 |
| High-Throughput Phenotyping using Computer Vision and Machine Learning | Jul 8, 2024 | Image Segmentation, Optical Character Recognition | Code Available | 0 | 5 |
| IDPL-PFOD2: A New Large-Scale Dataset for Printed Farsi Optical Character Recognition | Dec 2, 2023 | Optical Character Recognition, Printed Text Recognition | Code Available | 0 | 5 |
| Arrow-Guided VLM: Enhancing Flowchart Understanding via Arrow Direction Encoding | May 9, 2025 | Optical Character Recognition (OCR) | Code Available | 0 | 5 |
| Are VLMs Really Blind | Oct 29, 2024 | Language Modeling | Code Available | 0 | 5 |
| Advancing Multilingual Handwritten Numeral Recognition with Attention-driven Transfer Learning | Mar 18, 2024 | Handwritten Digit Recognition, Optical Character Recognition | Code Available | 0 | 5 |
| A model of diffuse Galactic Radio Emission from 10 MHz to 100 GHz | Feb 12, 2008 | Optical Character Recognition (OCR) | Code Available | 0 | 5 |