| LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding | Nov 2, 2024 | document understanding, Question Answering | —Unverified | 0 |
| A Multi-Modal Multilingual Benchmark for Document Image Classification | Oct 25, 2023 | Classification, Cross-Lingual Transfer | —Unverified | 0 |
| Leveraging Domain Agnostic and Specific Knowledge for Acronym Disambiguation | Jul 1, 2021 | document understanding, Word Embeddings | —Unverified | 0 |
| DOGE: Towards Versatile Visual Document Grounding and Referring | Nov 26, 2024 | document understanding | —Unverified | 0 |
| A Token-level Text Image Foundation Model for Document Understanding | Mar 4, 2025 | document understanding, Visual Question Answering (VQA) | —Unverified | 0 |
| Enumeration of Extractive Oracle Summaries | Jan 6, 2017 | document understanding, Extractive Summarization | —Unverified | 0 |
| A LayoutLMv3-Based Model for Enhanced Relation Extraction in Visually-Rich Documents | Apr 16, 2024 | document understanding, Key Information Extraction | —Unverified | 0 |
| Leveraging Long-Context Large Language Models for Multi-Document Understanding and Summarization in Enterprise Applications | Sep 27, 2024 | Diversity, Document Summarization | —Unverified | 0 |
| DocVLM: Make Your VLM an Efficient Reader | Dec 11, 2024 | document understanding, Optical Character Recognition (OCR) | —Unverified | 0 |
| A User-Centered Concept Mining System for Query and Document Understanding at Tencent | May 21, 2019 | document understanding, Knowledge Base Construction | —Unverified | 0 |
| Document Understanding for Healthcare Referrals | Sep 22, 2023 | document understanding, Management | —Unverified | 0 |
| Read and Think: An Efficient Step-wise Multimodal Language Model for Document Understanding and Reasoning | Feb 26, 2024 | Data Augmentation, document understanding | —Unverified | 0 |
| DAViD: Domain Adaptive Visually-Rich Document Understanding with Synthetic Insights | Oct 2, 2024 | document understanding, Domain Adaptation | —Unverified | 0 |
| Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding | May 19, 2023 | document understanding | —Unverified | 0 |
| Decontextualization: Making Sentences Stand-Alone | Feb 9, 2021 | document understanding, Question Answering | —Unverified | 0 |
| Finding Pragmatic Differences Between Disciplines | Sep 30, 2023 | Diversity, Document Summarization | —Unverified | 0 |
| Document Layout Analysis with Aesthetic-Guided Image Augmentation | Nov 27, 2021 | Document Layout Analysis, document understanding | —Unverified | 0 |
| FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction | Mar 16, 2022 | Document AI, document understanding | —Unverified | 0 |
| FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction | May 4, 2023 | Contrastive Learning, document understanding | —Unverified | 0 |
| AT-BERT: Adversarial Training BERT for Acronym Identification Winning Solution for SDU@AAAI-21 | Jan 11, 2021 | document understanding, Unsupervised Pre-training | —Unverified | 0 |
| Friendly Topic Assistant for Transformer Based Abstractive Summarization | Nov 1, 2020 | Abstractive Text Summarization, Document Summarization | —Unverified | 0 |
| From Entity Linking to Question Answering -- Recent Progress on Semantic Grounding Tasks | Dec 1, 2016 | document understanding, Entity Linking | —Unverified | 0 |
| Document Image Rectification Bases on Self-Adaptive Multitask Fusion | May 9, 2025 | document understanding | —Unverified | 0 |
| Génération de question à partir d’analyse sémantique pour l’adaptation non supervisée de modèles de compréhension de documents (Question generation from semantic analysis for unsupervised adaptation of document understanding models) | Jun 1, 2022 | document understanding, Question Generation | —Unverified | 0 |
| DocumentNet: Bridging the Data Gap in Document Pre-Training | Jun 15, 2023 | document understanding, Entity Retrieval | —Unverified | 0 |
| Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence | Mar 27, 2024 | Document AI, document understanding | —Unverified | 0 |
| Document Collection Visual Question Answering | Apr 27, 2021 | document understanding, Question Answering | —Unverified | 0 |
| A Survey on Vietnamese Document Analysis and Recognition: Challenges and Future Directions | Jun 5, 2025 | Computational Efficiency, document understanding | —Unverified | 0 |
| Calculating Semantic Similarity between Academic Articles using Topic Event and Ontology | Nov 30, 2017 | Articles, document understanding | —Unverified | 0 |
| DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding | Nov 20, 2023 | document understanding, Language Modeling | —Unverified | 0 |
| Building and better understanding vision-language models: insights and future directions | Aug 22, 2024 | document understanding | —Unverified | 0 |
| A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends | Jul 14, 2025 | document understanding, Optical Character Recognition | —Unverified | 0 |
| BuDDIE: A Business Document Dataset for Multi-task Information Extraction | Apr 5, 2024 | Document Classification, document understanding | —Unverified | 0 |
| DocMamba: Efficient Document Pre-training with State Space Model | Sep 18, 2024 | document understanding | —Unverified | 0 |
| A Survey and Approach to Chart Classification | Jul 9, 2023 | Chart Understanding, Classification | —Unverified | 0 |
| DocLLM: A layout-aware generative language model for multimodal document understanding | Dec 31, 2023 | document understanding, Language Modeling | —Unverified | 0 |
| BROS: A Pre-trained Language Model for Understanding Texts in Document | Jan 1, 2021 | Decoder, Diversity | —Unverified | 0 |
| LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding | May 30, 2023 | document-image-classification, Document Image Classification | —Unverified | 0 |
| Is Cognition consistent with Perception? Assessing and Mitigating Multimodal Knowledge Conflicts in Document Understanding | Nov 12, 2024 | document understanding, Optical Character Recognition (OCR) | —Unverified | 0 |
| BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations | Jan 6, 2025 | Document AI, document understanding | —Unverified | 0 |
| LAPDoc: Layout-Aware Prompting for Documents | Feb 15, 2024 | document understanding, Key Information Extraction | —Unverified | 0 |
| DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming | Jun 27, 2024 | document understanding | —Unverified | 0 |
| A Simple yet Effective Layout Token in Large Language Models for Document Understanding | Mar 24, 2025 | document understanding, Position | —Unverified | 0 |
| Information Extraction from Heterogeneous Documents without Ground Truth Labels using Synthetic Label Generation and Knowledge Distillation | Nov 22, 2024 | Anomaly Detection, document understanding | —Unverified | 0 |
| DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models | Oct 4, 2024 | document understanding, Knowledge Distillation | —Unverified | 0 |
| LAMPRET: Layout-Aware Multimodal PreTraining for Document Understanding | Apr 16, 2021 | document understanding | —Unverified | 0 |
| LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding | Mar 21, 2024 | document-image-classification, Document Image Classification | —Unverified | 0 |
| DocGraphLM: Documental Graph Language Model for Information Extraction | Jan 5, 2024 | document understanding, Language Modeling | —Unverified | 0 |
| Improving Keyphrase Extraction with Data Augmentation and Information Filtering | Sep 11, 2022 | Data Augmentation, document understanding | —Unverified | 0 |
| Joint Structured Learning and Predictions under Logical Constraints in Conditional Random Fields | Aug 25, 2017 | BIG-bench Machine Learning, document understanding | —Unverified | 0 |