Document Understanding

Document understanding covers document classification, layout analysis, information extraction, and document question answering (DocQA).

Papers



Title	Date	Tasks	Status
Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer	May 2, 2025	document understanding, Hallucination	Unverified
Automatic Knowledge Extraction with Human Interface	Apr 9, 2021	document understanding	Unverified
AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content	May 24, 2023	Document Summarization, document understanding	Unverified
BERT-AL: BERT for Arbitrarily Long Document Understanding	Jan 1, 2020	document understanding, Text Summarization	Unverified
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks	Dec 5, 2024	Code Generation, document understanding	Unverified
Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding	Jun 27, 2022	Document Classification, document understanding	Unverified
BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations	Jan 6, 2025	Document AI, document understanding	Unverified
BROS: A Pre-trained Language Model for Understanding Texts in Document	Jan 1, 2021	Decoder, Diversity	Unverified
BuDDIE: A Business Document Dataset for Multi-task Information Extraction	Apr 5, 2024	Document Classification, document understanding	Unverified
Building and better understanding vision-language models: insights and future directions	Aug 22, 2024	document understanding	Unverified
Calculating Semantic Similarity between Academic Articles using Topic Event and Ontology	Nov 30, 2017	Articles, document understanding	Unverified
Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence	Mar 27, 2024	Document AI, document understanding	Unverified
Read and Think: An Efficient Step-wise Multimodal Language Model for Document Understanding and Reasoning	Feb 26, 2024	Data Augmentation, document understanding	Unverified
ClueWeb22: 10 Billion Web Documents with Visual and Semantic Information	Nov 29, 2022	document understanding, Retrieval	Unverified
CREPE: Coordinate-Aware End-to-End Document Parser	May 1, 2024	document understanding, Optical Character Recognition (OCR)	Unverified
DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding	Jul 14, 2022	document understanding, Optical Character Recognition (OCR)	Unverified
DAViD: Domain Adaptive Visually-Rich Document Understanding with Synthetic Insights	Oct 2, 2024	document understanding, Domain Adaptation	Unverified
Decontextualization: Making Sentences Stand-Alone	Feb 9, 2021	document understanding, Question Answering	Unverified
DeeperDive: The Unreasonable Effectiveness of Weak Supervision in Document Understanding A Case Study in Collaboration with UiPath Inc	Aug 17, 2022	document understanding, Form	Unverified
Deep Learning based Key Information Extraction from Business Documents: Systematic Literature Review	Jul 23, 2024	Deep Learning, document understanding	Unverified
DiCoRe: Enhancing Zero-shot Event Detection via Divergent-Convergent LLM Reasoning	Jun 5, 2025	document understanding, Event Detection	Unverified
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications	Jun 12, 2024	document-image-classification, Document Image Classification	Unverified
DLUE: Benchmarking Document Language Understanding	May 16, 2023	Benchmarking, Document Classification	Unverified
Doc2Im: document to image conversion through self-attentive embedding	Nov 8, 2018	Document To Image Conversion, document understanding	Unverified
Doc-CoB: Enhancing Multi-Modal Document Understanding with Visual Chain-of-Boxes Reasoning	May 24, 2025	document understanding, Visual Reasoning	Unverified
DocGraphLM: Documental Graph Language Model for Information Extraction	Jan 5, 2024	document understanding, Language Modeling	Unverified
DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models	Oct 4, 2024	document understanding, Knowledge Distillation	Unverified
DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming	Jun 27, 2024	document understanding	Unverified
DocLLM: A layout-aware generative language model for multimodal document understanding	Dec 31, 2023	document understanding, Language Modeling	Unverified
DocMamba: Efficient Document Pre-training with State Space Model	Sep 18, 2024	document understanding	Unverified
DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding	Nov 20, 2023	document understanding, Language Modeling	Unverified
Document Collection Visual Question Answering	Apr 27, 2021	document understanding, Question Answering	Unverified
DocumentNet: Bridging the Data Gap in Document Pre-Training	Jun 15, 2023	document understanding, Entity Retrieval	Unverified
Document Image Rectification Bases on Self-Adaptive Multitask Fusion	May 9, 2025	document understanding	Unverified
Document Layout Analysis with Aesthetic-Guided Image Augmentation	Nov 27, 2021	Document Layout Analysis, document understanding	Unverified
Document Understanding for Healthcare Referrals	Sep 22, 2023	document understanding, Management	Unverified
DocVLM: Make Your VLM an Efficient Reader	Dec 11, 2024	document understanding, Optical Character Recognition (OCR)	Unverified
DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond	Oct 19, 2023	Document AI, Document Layout Analysis	Unverified
DOGE: Towards Versatile Visual Document Grounding and Referring	Nov 26, 2024	document understanding	Unverified
DONUT-hole: DONUT Sparsification by Harnessing Knowledge and Optimizing Learning Efficiency	Nov 9, 2023	document understanding, Key Information Extraction	Unverified
DrVideo: Document Retrieval Based Long Video Understanding	Jun 18, 2024	document understanding, EgoSchema	Unverified
DUBLIN -- Document Understanding By Language-Image Network	May 23, 2023	Document Classification, document understanding	Unverified
Efficient End-to-End Visual Document Understanding with Rationale Distillation	Nov 16, 2023	document understanding, Image to text	Unverified
Efficient layout-aware pretraining for multimodal form understanding	Jan 16, 2022	document understanding, Form	Unverified
Enhancing Question Answering on Charts Through Effective Pre-training Tasks	Jun 14, 2024	document understanding, Optical Character Recognition (OCR)	Unverified
Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models	Feb 29, 2024	Contrastive Learning, document understanding	Unverified
Enumeration of Extractive Oracle Summaries	Jan 6, 2017	document understanding, Extractive Summarization	Unverified
ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding	Sep 18, 2022	Common Sense Reasoning, document understanding	Unverified
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling	Dec 6, 2024	document understanding, Hallucination	Unverified
Extract with Order for Coherent Multi-Document Summarization	Jun 12, 2017	Document Summarization, document understanding	Unverified
