German Parliamentary Corpus (GerParCor) | Apr 21, 2022 | Optical Character Recognition (OCR)
Let's Enhance: A Deep Learning Approach to Extreme Deblurring of Text Images | Nov 18, 2022 | Deblurring, Image Deblurring
bbOCR: An Open-source Multi-domain OCR Pipeline for Bengali Documents | Aug 21, 2023 | distortion correction, Optical Character Recognition
Generating Synthetic Handwritten Historical Documents With OCR Constrained GANs | Mar 15, 2021 | Optical Character Recognition (OCR), Synthetic Data Generation
GenKIE: Robust Generative Multimodal Document Key Information Extraction | Oct 24, 2023 | Decoder, Key Information Extraction
LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images? | May 18, 2025 | Logical Reasoning, Multimodal Reasoning
Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval | Jul 1, 2020 | Optical Character Recognition (OCR), Retrieval
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions | May 28, 2023 | Attribute, Image Captioning
GenPlot: Increasing the Scale and Diversity of Chart Derendering Data | Jun 20, 2023 | Derendering, Diversity
An Empirical Study of Scaling Law for OCR | Dec 29, 2023 | Optical Character Recognition, Optical Character Recognition (OCR)
Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval | Aug 1, 2024 | Attribute, Optical Character Recognition
MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition | May 24, 2023 | Continual Learning, Incremental Learning
Multimodal LLMs for OCR, OCR Post-Correction, and Named Entity Recognition in Historical Documents | Apr 1, 2025 | named-entity-recognition, Named Entity Recognition
Modular Multimodal Machine Learning for Extraction of Theorems and Proofs in Long Scientific Documents (Extended Version) | Jul 18, 2023 | Articles, Document AI
From Text to Pixel: Advancing Long-Context Understanding in MLLMs | May 23, 2024 | Language Modeling, Language Modelling
Geometry Restoration and Dewarping of Camera-Captured Document Images | Jan 6, 2025 | Optical Character Recognition, Optical Character Recognition (OCR)
Neural OCR Post-Hoc Correction of Historical Corpora | Feb 1, 2021 | Optical Character Recognition, Optical Character Recognition (OCR)
ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark | May 22, 2025 | document understanding, Multimodal Reasoning
Exploring OCR Capabilities of GPT-4V(ision): A Quantitative and In-depth Evaluation | Oct 25, 2023 | Handwritten Text Recognition, Key Information Extraction
Exploring Cross-Image Pixel Contrast for Semantic Segmentation | Jan 28, 2021 | Metric Learning, Optical Character Recognition (OCR)
FAWA: Fast Adversarial Watermark Attack on Optical Character Recognition (OCR) Systems | Dec 15, 2020 | Optical Character Recognition, Optical Character Recognition (OCR)
Accurate, Data-Efficient, Unconstrained Text Recognition with Convolutional Neural Networks | Dec 31, 2018 | Handwriting Recognition, License Plate Recognition
OCR-VQGAN: Taming Text-within-Image Generation | Oct 19, 2022 | Articles, Decoder
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting | Mar 1, 2024 | Optical Character Recognition, Optical Character Recognition (OCR)
An Automatic Approach for Generating Rich, Linked Geo-Metadata from Historical Map Images | Dec 3, 2021 | Optical Character Recognition, Optical Character Recognition (OCR)
Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts | Nov 16, 2024 | Mixture-of-Experts, Optical Character Recognition (OCR)
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts | Nov 9, 2023 | Optical Character Recognition (OCR), Safety Alignment
End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional U-Net | Jun 2, 2021 | Optical Character Recognition (OCR)
Easter2.0: Improving convolutional models for handwritten text recognition | May 30, 2022 | Data Augmentation, Few-Shot Learning
Enhancing License Plate Super-Resolution: A Layout-Aware and Character-Driven Approach | Aug 27, 2024 | License Plate Recognition, Optical Character Recognition
Attack of the Tails: Yes, You Really Can Backdoor Federated Learning | Jul 9, 2020 | Fairness, Federated Learning
AT-ST: Self-Training Adaptation Strategy for OCR in Domains with Limited Transcriptions | Apr 27, 2021 | Optical Character Recognition (OCR)
EAST: An Efficient and Accurate Scene Text Detector | Apr 11, 2017 | Curved Text Detection, Optical Character Recognition (OCR)
Exploring Better Text Image Translation with Multimodal Codebook | May 27, 2023 | Machine Translation, Optical Character Recognition
DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction | Dec 1, 2023 | Optical Character Recognition (OCR)
DocFormerv2: Local Features for Document Understanding | Jun 2, 2023 | Decoder, document understanding
Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders | Jun 10, 2020 | Cell Segmentation, Denoising
DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding | Aug 27, 2024 | document understanding, Optical Character Recognition (OCR)
DocScanner: Robust Document Image Rectification with Progressive Learning | Oct 28, 2021 | Optical Character Recognition (OCR)
Detection of Furigana Text in Images | Jul 8, 2022 | object-detection, Object Detection
DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding | Jan 1, 2025 | document understanding, Optical Character Recognition (OCR)
DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents | Apr 24, 2023 | Optical Character Recognition, Optical Character Recognition (OCR)
A Multiplexed Network for End-to-End, Multilingual OCR | Mar 29, 2021 | Optical Character Recognition (OCR), Text Detection
Digitizing Historical Balance Sheet Data: A Practitioner's Guide | Mar 31, 2022 | Optical Character Recognition, Optical Character Recognition (OCR)
DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement | Oct 17, 2020 | Binarization, Deblurring
DSG: An End-to-End Document Structure Generator | Oct 13, 2023 | Optical Character Recognition (OCR)
DiT: Self-supervised Pre-training for Document Image Transformer | Mar 4, 2022 | Document AI, document-image-classification
A Two-Step Approach for Automatic OCR Post-Correction | Dec 1, 2020 | Optical Character Recognition, Optical Character Recognition (OCR)
Efficient OCR for Building a Diverse Digital History | Apr 5, 2023 | Diversity, Image Retrieval
DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction | Oct 25, 2021 | Optical Character Recognition (OCR)