| Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer | May 2, 2025 | document understanding, Hallucination | —Unverified | 0 | 0 |
| Automatic Knowledge Extraction with Human Interface | Apr 9, 2021 | document understanding | —Unverified | 0 | 0 |
| AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content | May 24, 2023 | Document Summarization, document understanding | —Unverified | 0 | 0 |
| BERT-AL: BERT for Arbitrarily Long Document Understanding | Jan 1, 2020 | document understanding, Text Summarization | —Unverified | 0 | 0 |
| BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks | Dec 5, 2024 | Code Generation, document understanding | —Unverified | 0 | 0 |
| Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding | Jun 27, 2022 | Document Classification, document understanding | —Unverified | 0 | 0 |
| BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations | Jan 6, 2025 | Document AI, document understanding | —Unverified | 0 | 0 |
| BROS: A Pre-trained Language Model for Understanding Texts in Document | Jan 1, 2021 | Decoder, Diversity | —Unverified | 0 | 0 |
| BuDDIE: A Business Document Dataset for Multi-task Information Extraction | Apr 5, 2024 | Document Classification, document understanding | —Unverified | 0 | 0 |
| Building and better understanding vision-language models: insights and future directions | Aug 22, 2024 | document understanding | —Unverified | 0 | 0 |
| Calculating Semantic Similarity between Academic Articles using Topic Event and Ontology | Nov 30, 2017 | Articles, document understanding | —Unverified | 0 | 0 |
| Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence | Mar 27, 2024 | Document AI, document understanding | —Unverified | 0 | 0 |
| Read and Think: An Efficient Step-wise Multimodal Language Model for Document Understanding and Reasoning | Feb 26, 2024 | Data Augmentation, document understanding | —Unverified | 0 | 0 |
| ClueWeb22: 10 Billion Web Documents with Visual and Semantic Information | Nov 29, 2022 | document understanding, Retrieval | —Unverified | 0 | 0 |
| CREPE: Coordinate-Aware End-to-End Document Parser | May 1, 2024 | document understanding, Optical Character Recognition (OCR) | —Unverified | 0 | 0 |
| DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding | Jul 14, 2022 | document understanding, Optical Character Recognition (OCR) | —Unverified | 0 | 0 |
| DAViD: Domain Adaptive Visually-Rich Document Understanding with Synthetic Insights | Oct 2, 2024 | document understanding, Domain Adaptation | —Unverified | 0 | 0 |
| Decontextualization: Making Sentences Stand-Alone | Feb 9, 2021 | document understanding, Question Answering | —Unverified | 0 | 0 |
| DeeperDive: The Unreasonable Effectiveness of Weak Supervision in Document Understanding A Case Study in Collaboration with UiPath Inc | Aug 17, 2022 | document understanding, Form | —Unverified | 0 | 0 |
| Deep Learning based Key Information Extraction from Business Documents: Systematic Literature Review | Jul 23, 2024 | Deep Learning, document understanding | —Unverified | 0 | 0 |
| DiCoRe: Enhancing Zero-shot Event Detection via Divergent-Convergent LLM Reasoning | Jun 5, 2025 | document understanding, Event Detection | —Unverified | 0 | 0 |
| DistilDoc: Knowledge Distillation for Visually-Rich Document Applications | Jun 12, 2024 | document-image-classification, Document Image Classification | —Unverified | 0 | 0 |
| DLUE: Benchmarking Document Language Understanding | May 16, 2023 | Benchmarking, Document Classification | —Unverified | 0 | 0 |
| Doc2Im: document to image conversion through self-attentive embedding | Nov 8, 2018 | Document To Image Conversion, document understanding | —Unverified | 0 | 0 |
| Doc-CoB: Enhancing Multi-Modal Document Understanding with Visual Chain-of-Boxes Reasoning | May 24, 2025 | document understanding, Visual Reasoning | —Unverified | 0 | 0 |
| DocGraphLM: Documental Graph Language Model for Information Extraction | Jan 5, 2024 | document understanding, Language Modeling | —Unverified | 0 | 0 |
| DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models | Oct 4, 2024 | document understanding, Knowledge Distillation | —Unverified | 0 | 0 |
| DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming | Jun 27, 2024 | document understanding | —Unverified | 0 | 0 |
| DocLLM: A layout-aware generative language model for multimodal document understanding | Dec 31, 2023 | document understanding, Language Modeling | —Unverified | 0 | 0 |
| DocMamba: Efficient Document Pre-training with State Space Model | Sep 18, 2024 | document understanding | —Unverified | 0 | 0 |
| DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding | Nov 20, 2023 | document understanding, Language Modeling | —Unverified | 0 | 0 |
| Document Collection Visual Question Answering | Apr 27, 2021 | document understanding, Question Answering | —Unverified | 0 | 0 |
| DocumentNet: Bridging the Data Gap in Document Pre-Training | Jun 15, 2023 | document understanding, Entity Retrieval | —Unverified | 0 | 0 |
| Document Image Rectification Bases on Self-Adaptive Multitask Fusion | May 9, 2025 | document understanding | —Unverified | 0 | 0 |
| Document Layout Analysis with Aesthetic-Guided Image Augmentation | Nov 27, 2021 | Document Layout Analysis, document understanding | —Unverified | 0 | 0 |
| Document Understanding for Healthcare Referrals | Sep 22, 2023 | document understanding, Management | —Unverified | 0 | 0 |
| DocVLM: Make Your VLM an Efficient Reader | Dec 11, 2024 | document understanding, Optical Character Recognition (OCR) | —Unverified | 0 | 0 |
| DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond | Oct 19, 2023 | Document AI, Document Layout Analysis | —Unverified | 0 | 0 |
| DOGE: Towards Versatile Visual Document Grounding and Referring | Nov 26, 2024 | document understanding | —Unverified | 0 | 0 |
| DONUT-hole: DONUT Sparsification by Harnessing Knowledge and Optimizing Learning Efficiency | Nov 9, 2023 | document understanding, Key Information Extraction | —Unverified | 0 | 0 |
| DrVideo: Document Retrieval Based Long Video Understanding | Jun 18, 2024 | document understanding, EgoSchema | —Unverified | 0 | 0 |
| DUBLIN -- Document Understanding By Language-Image Network | May 23, 2023 | Document Classification, document understanding | —Unverified | 0 | 0 |
| Efficient End-to-End Visual Document Understanding with Rationale Distillation | Nov 16, 2023 | document understanding, Image to text | —Unverified | 0 | 0 |
| Efficient layout-aware pretraining for multimodal form understanding | Jan 16, 2022 | document understanding, Form | —Unverified | 0 | 0 |
| Enhancing Question Answering on Charts Through Effective Pre-training Tasks | Jun 14, 2024 | document understanding, Optical Character Recognition (OCR) | —Unverified | 0 | 0 |
| Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models | Feb 29, 2024 | Contrastive Learning, document understanding | —Unverified | 0 | 0 |
| Enumeration of Extractive Oracle Summaries | Jan 6, 2017 | document understanding, Extractive Summarization | —Unverified | 0 | 0 |
| ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding | Sep 18, 2022 | Common Sense Reasoning, document understanding | —Unverified | 0 | 0 |
| Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling | Dec 6, 2024 | document understanding, Hallucination | —Unverified | 0 | 0 |
| Extract with Order for Coherent Multi-Document Summarization | Jun 12, 2017 | Document Summarization, document understanding | —Unverified | 0 | 0 |