Privacy-Aware Document Visual Question Answering Dec 15, 2023 document understanding Federated Learning
DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction Dec 1, 2023 Optical Character Recognition (OCR)
Data Generation for Post-OCR correction of Cyrillic handwriting Nov 27, 2023 Handwriting generation Handwritten Text Recognition
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts Nov 9, 2023 Optical Character Recognition (OCR) Safety Alignment
Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation Oct 25, 2023 Handwritten Text Recognition Key Information Extraction
GenKIE: Robust Generative Multimodal Document Key Information Extraction Oct 24, 2023 Decoder Key Information Extraction
DSG: An End-to-End Document Structure Generator Oct 13, 2023 Optical Character Recognition (OCR)
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model Oct 8, 2023 Decoder Language Modeling
Persis: A Persian Font Recognition Pipeline Using Convolutional Neural Networks Oct 8, 2023 Binarization CPU
Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition Oct 8, 2023 Image to text Optical Character Recognition (OCR)
bbOCR: An Open-source Multi-domain OCR Pipeline for Bengali Documents Aug 21, 2023 distortion correction Optical Character Recognition
OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation Aug 8, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Universal Defensive Underpainting Patch: Making Your Text Invisible to Optical Character Recognition Aug 4, 2023 Optical Character Recognition Optical Character Recognition (OCR)
Modular Multimodal Machine Learning for Extraction of Theorems and Proofs in Long Scientific Documents (Extended Version) Jul 18, 2023 Articles Document AI
UTRNet: High-Resolution Urdu Text Recognition In Printed Documents Jun 27, 2023 Line Detection Optical Character Recognition (OCR)
GenPlot: Increasing the Scale and Diversity of Chart Derendering Data Jun 20, 2023 Derendering Diversity
TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain Jun 3, 2023 Benchmarking Decoder
DocFormerv2: Local Features for Document Understanding Jun 2, 2023 Decoder document understanding
Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering Jun 1, 2023 Optical Character Recognition (OCR) Question Answering
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions May 28, 2023 Attribute Image Captioning
Exploring Better Text Image Translation with Multimodal Codebook May 27, 2023 Machine Translation Optical Character Recognition
Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers May 27, 2023 Image Super-Resolution License Plate Recognition
MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition May 24, 2023 Continual Learning Incremental Learning
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages May 19, 2023 In-Context Learning Multilingual NLP
Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution May 12, 2023 Contrastive Learning Optical Character Recognition (OCR)
TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition May 9, 2023 Optical Character Recognition (OCR) Scene Text Recognition
DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents Apr 24, 2023 Optical Character Recognition Optical Character Recognition (OCR)
TagGPT: Large Language Models are Zero-shot Multimodal Taggers Apr 6, 2023 Optical Character Recognition (OCR) Prompt Engineering
Efficient OCR for Building a Diverse Digital History Apr 5, 2023 Diversity Image Retrieval
ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules Apr 5, 2023 Chart Understanding Derendering
Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification Feb 16, 2023 Few-Shot Image Classification Few-Shot Learning
A Comprehensive Gold Standard and Benchmark for Comics Text Detection and Recognition Dec 27, 2022 Optical Character Recognition Optical Character Recognition (OCR)
SoftCTC -- Semi-Supervised Learning for Text Recognition using Soft Pseudo-Labels Dec 5, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Let's Enhance: A Deep Learning Approach to Extreme Deblurring of Text Images Nov 18, 2022 Deblurring Image Deblurring
A Benchmark and Dataset for Post-OCR text correction in Sanskrit Nov 15, 2022 Astronomy Optical Character Recognition (OCR)
NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research Nov 15, 2022 Continual Learning Diversity
On Web-based Visual Corpus Construction for Visual Document Understanding Nov 7, 2022 document understanding Optical Character Recognition (OCR)
Unsupervised Audio-Visual Lecture Segmentation Oct 29, 2022 Navigate Optical Character Recognition (OCR)
MCSCSet: A Specialist-annotated Dataset for Medical-domain Chinese Spelling Correction Oct 21, 2022 Optical Character Recognition Optical Character Recognition (OCR)
OCR-VQGAN: Taming Text-within-Image Generation Oct 19, 2022 Articles Decoder
Task Grouping for Multilingual Text Recognition Oct 13, 2022 Optical Character Recognition (OCR)
HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions Sep 18, 2022 object-detection Object Detection
Graph Neural Networks and Representation Embedding for Table Extraction in PDF Documents Aug 23, 2022 Optical Character Recognition (OCR) Table Extraction
Marior: Margin Removal and Iterative Content Rectification for Document Dewarping in the Wild Jul 23, 2022 Optical Character Recognition (OCR)
You Actually Look Twice At it (YALTAi): using an object detection approach instead of region segmentation within the Kraken engine Jul 19, 2022 Classification object-detection
Detection of Furigana Text in Images Jul 8, 2022 object-detection Object Detection
hmBERT: Historical Multilingual Language Models for Named Entity Recognition May 31, 2022 Language Modeling Language Modelling
Easter2.0: Improving convolutional models for handwritten text recognition May 30, 2022 Data Augmentation Few-Shot Learning
German Parliamentary Corpus (GerParCor) Apr 21, 2022 Optical Character Recognition (OCR)
Digitizing Historical Balance Sheet Data: A Practitioner's Guide Mar 31, 2022 Optical Character Recognition Optical Character Recognition (OCR)