TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document | Mar 7, 2024 | document understanding, Key Information Extraction | Code Available | 5
A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding | Jul 2, 2024 | document understanding, Key Information Extraction | Code Available | 2
OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models | May 13, 2023 | Key Information Extraction, Nutrition | Code Available | 2
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding | Feb 28, 2022 | Document Image Classification, document understanding | Code Available | 2
LayoutLM: Pre-training of Text and Layout for Document Image Understanding | Dec 31, 2019 | Document AI, document-image-classification | Code Available | 2
Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding | Sep 29, 2024 | document understanding, Entity Linking | Code Available | 1
KVP10k: A Comprehensive Dataset for Key-Value Pair Extraction in Business Documents | May 1, 2024 | Diversity, Key Information Extraction | Code Available | 1
PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction | Jan 7, 2024 | Key Information Extraction, Key-value Pair Extraction | Code Available | 1
Exploring OCR Capabilities of GPT-4V(ision): A Quantitative and In-depth Evaluation | Oct 25, 2023 | Handwritten Text Recognition, Key Information Extraction | Code Available | 1
GenKIE: Robust Generative Multimodal Document Key Information Extraction | Oct 24, 2023 | Decoder, Key Information Extraction | Code Available | 1
Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction | Oct 17, 2023 | Entity Linking, Key Information Extraction | Code Available | 1
Form-NLU: Dataset for the Form Natural Language Understanding | Apr 4, 2023 | 4k, Form | Code Available | 1
DocILE Benchmark for Document Information Localization and Extraction | Feb 11, 2023 | Key Information Extraction, Unsupervised Pre-training | Code Available | 1
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding | Oct 12, 2022 | document-image-classification, Document Image Classification | Code Available | 1
Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks | Aug 23, 2022 | Document Layout Analysis, document understanding | Code Available | 1
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents | Aug 10, 2021 | Key Information Extraction, Language Modeling | Code Available | 1
Key Information Extraction From Documents: Evaluation And Generator | Jun 9, 2021 | Decoder, Key Information Extraction | Code Available | 1
PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks | Apr 16, 2020 | Graph Learning, Key Information Extraction | Code Available | 1
LAMBERT: Layout-Aware (Language) Modeling for information extraction | Feb 19, 2020 | Key Information Extraction, Language Modeling | Code Available | 1
PaddleOCR 3.0 Technical Report | Jul 8, 2025 | document understanding, Key Information Extraction | Code Available | 0
Class-Agnostic Region-of-Interest Matching in Document Images | Jun 26, 2025 | Document Layout Analysis, document understanding | Code Available | 0
Hallucinations and Key Information Extraction in Medical Texts: A Comprehensive Assessment of Open-Source Large Language Models | Apr 27, 2025 | Key Information Extraction, Natural Language Understanding | Unverified | 0
Emergency Communication: OTFS-Based Semantic Transmission with Diffusion Noise Suppression | Apr 10, 2025 | Denoising, Key Information Extraction | Unverified | 0
KIEval: Evaluation Metric for Document Key Information Extraction | Mar 7, 2025 | Key Information Extraction | Unverified | 0
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models | Feb 22, 2025 | document understanding, Key Information Extraction | Code Available | 0
CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy | Dec 3, 2024 | Hallucination, Key Information Extraction | Unverified | 0
"What is the value of templates?" Rethinking Document Information Extraction Datasets for LLMs | Oct 20, 2024 | document understanding, Key Information Extraction | Unverified | 0
GraphRevisedIE: Multimodal Information Extraction with Graph-Revised Network | Oct 2, 2024 | Key Information Extraction | Code Available | 0
See then Tell: Enhancing Key Information Extraction with Vision Grounding | Sep 29, 2024 | Image to text, Key Information Extraction | Unverified | 0
ViBERTgrid BiLSTM-CRF: Multimodal Key Information Extraction from Unstructured Financial Documents | Sep 23, 2024 | Key Information Extraction, named-entity-recognition | Unverified | 0
Information Extraction from Visually Rich Documents Using Directed Weighted Graph Neural Network | Sep 11, 2024 | Document Layout Analysis, document understanding | Code Available | 0
Deep Learning based Key Information Extraction from Business Documents: Systematic Literature Review | Jul 23, 2024 | Deep Learning, document understanding | Unverified | 0
Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use | May 30, 2024 | document understanding, Key Information Extraction | Unverified | 0
XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | May 27, 2024 | Document AI, Form | Code Available | 0
A LayoutLMv3-Based Model for Enhanced Relation Extraction in Visually-Rich Documents | Apr 16, 2024 | document understanding, Key Information Extraction | Unverified | 0
RealKIE: Five Novel Datasets for Enterprise Key Information Extraction | Mar 29, 2024 | Key Information Extraction, Optical Character Recognition (OCR) | Unverified | 0
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition | Mar 28, 2024 | Decoder, document understanding | Code Available | 0
Construction of a Syntactic Analysis Map for Yi Shui School through Text Mining and Natural Language Processing Research | Feb 16, 2024 | graph construction, Information Retrieval | Unverified | 0
LAPDoc: Layout-Aware Prompting for Documents | Feb 15, 2024 | document understanding, Key Information Extraction | Unverified | 0
Different Tastes of Entities: Investigating Human Label Variation in Named Entity Annotations | Feb 2, 2024 | Key Information Extraction, named-entity-recognition | Code Available | 0
UniVIE: A Unified Label Space Approach to Visual Information Extraction from Form-like Documents | Jan 17, 2024 | Decoder, Form | Unverified | 0
Multimodal weighted graph representation for information extraction from visually rich documents | Jan 5, 2024 | Document Layout Analysis, document understanding | Code Available | 0
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition | Jan 1, 2024 | Decoder, document understanding | Code Available | 0
DONUT-hole: DONUT Sparsification by Harnessing Knowledge and Optimizing Learning Efficiency | Nov 9, 2023 | document understanding, Key Information Extraction | Unverified | 0
VKIE: The Application of Key Information Extraction on Video Text | Oct 18, 2023 | Key Information Extraction | Unverified | 0
PrIeD-KIE: Towards Privacy Preserved Document Key Information Extraction | Oct 5, 2023 | Document AI, Federated Learning | Unverified | 0
Data Efficient Training of a U-Net Based Architecture for Structured Documents Localization | Oct 2, 2023 | Decoder, Deep Learning | Unverified | 0
AMuRD: Annotated Arabic-English Receipt Dataset for Key Information Extraction and Classification | Sep 18, 2023 | Classification, Key Information Extraction | Code Available | 0
PPN: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts | Jul 20, 2023 | Key Information Extraction | Unverified | 0
End-to-End Document Classification and Key Information Extraction using Assignment Optimization | Jun 1, 2023 | Classification, Document Classification | Unverified | 0