TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document | Mar 7, 2024 | document understanding, Key Information Extraction | Code Available | 55
A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding | Jul 2, 2024 | document understanding, Key Information Extraction | Code Available | 25
LayoutLM: Pre-training of Text and Layout for Document Image Understanding | Dec 31, 2019 | Document AI, document-image-classification | Code Available | 25
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding | Feb 28, 2022 | Document Image Classification, document understanding | Code Available | 25
OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models | May 13, 2023 | Key Information Extraction, Nutrition | Code Available | 25
PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks | Apr 16, 2020 | Graph Learning, Key Information Extraction | Code Available | 15
Key Information Extraction From Documents: Evaluation And Generator | Jun 9, 2021 | Decoder, Key Information Extraction | Code Available | 15
Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks | Aug 23, 2022 | Document Layout Analysis, document understanding | Code Available | 15
GenKIE: Robust Generative Multimodal Document Key Information Extraction | Oct 24, 2023 | Decoder, Key Information Extraction | Code Available | 15
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding | Oct 12, 2022 | document-image-classification, Document Image Classification | Code Available | 15
KVP10k : A Comprehensive Dataset for Key-Value Pair Extraction in Business Documents | May 1, 2024 | Diversity, Key Information Extraction | Code Available | 15
LAMBERT: Layout-Aware (Language) Modeling for information extraction | Feb 19, 2020 | Key Information Extraction, Language Modeling | Code Available | 15
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents | Aug 10, 2021 | Key Information Extraction, Language Modeling | Code Available | 15
Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction | Oct 17, 2023 | Entity Linking, Key Information Extraction | Code Available | 15
Form-NLU: Dataset for the Form Natural Language Understanding | Apr 4, 2023 | 4k, Form | Code Available | 15
Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding | Sep 29, 2024 | document understanding, Entity Linking | Code Available | 15
DocILE Benchmark for Document Information Localization and Extraction | Feb 11, 2023 | Key Information Extraction, Unsupervised Pre-training | Code Available | 15
Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation | Oct 25, 2023 | Handwritten Text Recognition, Key Information Extraction | Code Available | 15
PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction | Jan 7, 2024 | Key Information Extraction, Key-value Pair Extraction | Code Available | 15
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding | Dec 29, 2020 | Document Image Classification, Document Layout Analysis | Code Available | 05
AMuRD: Annotated Arabic-English Receipt Dataset for Key Information Extraction and Classification | Sep 18, 2023 | Classification, Key Information Extraction | Code Available | 05
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | Apr 18, 2022 | cross-modal alignment, Document AI | Code Available | 05
MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding | Aug 14, 2021 | Key Information Extraction, named-entity-recognition | Code Available | 05
Multimodal weighted graph representation for information extraction from visually rich documents. | Jan 5, 2024 | Document Layout Analysis, document understanding | Code Available | 05
Different Tastes of Entities: Investigating Human Label Variation in Named Entity Annotations | Feb 2, 2024 | Key Information Extraction, named-entity-recognition | Code Available | 05
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition | Mar 28, 2024 | Decoder, document understanding | Code Available | 05
OmniParser: A Unified Framework for Text Spotting Key Information Extraction and Table Recognition | Jan 1, 2024 | Decoder, document understanding | Code Available | 05
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models | Feb 22, 2025 | document understanding, Key Information Extraction | Code Available | 05
PaddleOCR 3.0 Technical Report | Jul 8, 2025 | document understanding, Key Information Extraction | Code Available | 05
PP-StructureV2: A Stronger Document Analysis System | Oct 11, 2022 | Key Information Extraction, Knowledge Distillation | Code Available | 05
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction | Apr 21, 2023 | Document AI, entity_extraction | Code Available | 05
GraphRevisedIE: Multimodal Information Extraction with Graph-Revised Network | Oct 2, 2024 | Key Information Extraction | Code Available | 05
ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction | Mar 18, 2021 | Key Information Extraction, Optical Character Recognition (OCR) | Code Available | 05
Information Extraction from Visually Rich Documents Using Directed Weighted Graph Neural Network | Sep 11, 2024 | Document Layout Analysis, document understanding | Code Available | 05
Spatial Dual-Modality Graph Reasoning for Key Information Extraction | Mar 26, 2021 | Key Information Extraction, Template Matching | Code Available | 05
Information Redundancy and Biases in Public Document Information Extraction Benchmarks | Apr 28, 2023 | document understanding, Key Information Extraction | Code Available | 05
Class-Agnostic Region-of-Interest Matching in Document Images | Jun 26, 2025 | Document Layout Analysis, document understanding | Code Available | 05
DoSA : A System to Accelerate Annotations on Business Documents with Human-in-the-Loop | Nov 9, 2022 | Document AI, Key Information Extraction | Code Available | 05
Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations | Jul 1, 2021 | Key Information Extraction, Optical Character Recognition (OCR) | Code Available | 05
XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | May 27, 2024 | Document AI, Form | Code Available | 05
See then Tell: Enhancing Key Information Extraction with Vision Grounding | Sep 29, 2024 | Image to text, Key Information Extraction | Unverified | 00
A LayoutLMv3-Based Model for Enhanced Relation Extraction in Visually-Rich Documents | Apr 16, 2024 | document understanding, Key Information Extraction | Unverified | 00
CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy | Dec 3, 2024 | Hallucination, Key Information Extraction | Unverified | 00
Construction of a Syntactic Analysis Map for Yi Shui School through Text Mining and Natural Language Processing Research | Feb 16, 2024 | graph construction, Information Retrieval | Unverified | 00
Data Efficient Training of a U-Net Based Architecture for Structured Documents Localization | Oct 2, 2023 | Decoder, Deep Learning | Unverified | 00
Deep Learning based Key Information Extraction from Business Documents: Systematic Literature Review | Jul 23, 2024 | Deep Learning, document understanding | Unverified | 00
DONUT-hole: DONUT Sparsification by Harnessing Knowledge and Optimizing Learning Efficiency | Nov 9, 2023 | document understanding, Key Information Extraction | Unverified | 00
DUBLIN -- Document Understanding By Language-Image Network | May 23, 2023 | Document Classification, document understanding | Unverified | 00
Emergency Communication: OTFS-Based Semantic Transmission with Diffusion Noise Suppression | Apr 10, 2025 | Denoising, Key Information Extraction | Unverified | 00
End-to-End Document Classification and Key Information Extraction using Assignment Optimization | Jun 1, 2023 | Classification, Document Classification | Unverified | 00