| GlobalDoc: A Cross-Modal Vision-Language Framework for Real-World Document Image Retrieval and Classification | Sep 11, 2023 | document-image-classification, Document Image Classification | —Unverified | 0 | 0 |
| Transformer-based Approach for Document Understanding | Oct 16, 2022 | Decoder, Document Layout Analysis | —Unverified | 0 | 0 |
| Two to Five Truths in Non-Negative Matrix Factorization | May 6, 2023 | Clustering, document understanding | —Unverified | 0 | 0 |
| Understanding Long Documents with Different Position-Aware Attentions | Aug 17, 2022 | document understanding, Position | —Unverified | 0 | 0 |
| UniDoc: Unified Pretraining Framework for Document Understanding | Dec 1, 2021 | document understanding, Self-Supervised Learning | —Unverified | 0 | 0 |
| Unified Pretraining Framework for Document Understanding | Apr 22, 2022 | Document Layout Analysis, document understanding | —Unverified | 0 | 0 |
| Unimodal and Multimodal Representation Training for Relation Extraction | Nov 11, 2022 | document understanding, Relation | —Unverified | 0 | 0 |
| ViRED: Prediction of Visual Relations in Engineering Drawings | Sep 2, 2024 | Decoder, document understanding | —Unverified | 0 | 0 |
| WebFormer: The Web-page Transformer for Structure Information Extraction | Feb 1, 2022 | Deep Attention, document understanding | —Unverified | 0 | 0 |
| "What is the value of templates?" Rethinking Document Information Extraction Datasets for LLMs | Oct 20, 2024 | document understanding, Key Information Extraction | —Unverified | 0 | 0 |
| What Makes a Good Dataset for Symbol Description Reading? | Apr 17, 2023 | document understanding, Math | —Unverified | 0 | 0 |
| WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts | Jun 18, 2025 | document understanding, Multiple-choice | —Unverified | 0 | 0 |
| Workshop on Document Intelligence Understanding | Jul 31, 2023 | document understanding, Visual Question Answering (VQA) | —Unverified | 0 | 0 |
| XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding | May 1, 2022 | document understanding, Form | —Unverified | 0 | 0 |
| Deep Learning based Visually Rich Document Content Understanding: A Survey | Aug 2, 2024 | Deep Learning, document understanding | —Unverified | 0 | 0 |
| Zero-Shot Prompting and Few-Shot Fine-Tuning: Revisiting Document Image Classification Using Large Language Models | Dec 18, 2024 | Document Classification, document-image-classification | —Unverified | 0 | 0 |
| WildDoc: How Far Are We from Achieving Comprehensive and Robust Document Understanding in the Wild? | May 16, 2025 | document understanding | —Unverified | 0 | 0 |
| VRDU: A Benchmark for Visually-rich Document Understanding | Nov 15, 2022 | document understanding | —Unverified | 0 | 0 |
| Acronym Identification and Disambiguation Shared Tasks for Scientific Document Understanding | Dec 22, 2020 | document understanding | —Unverified | 0 | 0 |
| A LayoutLMv3-Based Model for Enhanced Relation Extraction in Visually-Rich Documents | Apr 16, 2024 | document understanding, Key Information Extraction | —Unverified | 0 | 0 |
| A Multi-Modal Multilingual Benchmark for Document Image Classification | Oct 25, 2023 | Classification, Cross-Lingual Transfer | —Unverified | 0 | 0 |
| Arctic-TILT. Business Document Understanding at Sub-Billion Scale | Aug 8, 2024 | document understanding, GPU | —Unverified | 0 | 0 |
| A Retrospective Recount of Computer Architecture Research with a Data-Driven Study of Over Four Decades of ISCA Publications | Jun 22, 2019 | document understanding, Natural Language Understanding | —Unverified | 0 | 0 |
| A Simple yet Effective Layout Token in Large Language Models for Document Understanding | Mar 24, 2025 | document understanding, Position | —Unverified | 0 | 0 |
| Assessing Generative AI value in a public sector context: evidence from a field experiment | Feb 13, 2025 | document understanding | —Unverified | 0 | 0 |
| A Survey and Approach to Chart Classification | Jul 9, 2023 | Chart Understanding, Classification | —Unverified | 0 | 0 |
| A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends | Jul 14, 2025 | document understanding, Optical Character Recognition | —Unverified | 0 | 0 |
| A Survey on Vietnamese Document Analysis and Recognition: Challenges and Future Directions | Jun 5, 2025 | Computational Efficiency, document understanding | —Unverified | 0 | 0 |
| AT-BERT: Adversarial Training BERT for Acronym Identification Winning Solution for SDU@AAAI-21 | Jan 11, 2021 | document understanding, Unsupervised Pre-training | —Unverified | 0 | 0 |
| A Token-level Text Image Foundation Model for Document Understanding | Mar 4, 2025 | document understanding, Visual Question Answering (VQA) | —Unverified | 0 | 0 |
| Attention-Based Graph Neural Network with Global Context Awareness for Document Understanding | Oct 1, 2020 | document understanding, graph construction | —Unverified | 0 | 0 |
| Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration | Sep 3, 2023 | Decoder, document understanding | —Unverified | 0 | 0 |
| A User-Centered Concept Mining System for Query and Document Understanding at Tencent | May 21, 2019 | document understanding, Knowledge Base Construction | —Unverified | 0 | 0 |
| Auto-encodeurs pour la compréhension de documents parlés (Auto-encoders for Spoken Document Understanding) | Jul 1, 2016 | document understanding | —Unverified | 0 | 0 |
| Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer | May 2, 2025 | document understanding, Hallucination | —Unverified | 0 | 0 |
| Automatic Knowledge Extraction with Human Interface | Apr 9, 2021 | document understanding | —Unverified | 0 | 0 |
| AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content | May 24, 2023 | Document Summarization, document understanding | —Unverified | 0 | 0 |
| BERT-AL: BERT for Arbitrarily Long Document Understanding | Jan 1, 2020 | document understanding, Text Summarization | —Unverified | 0 | 0 |
| BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks | Dec 5, 2024 | Code Generation, document understanding | —Unverified | 0 | 0 |
| Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding | Jun 27, 2022 | Document Classification, document understanding | —Unverified | 0 | 0 |
| BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations | Jan 6, 2025 | Document AI, document understanding | —Unverified | 0 | 0 |
| BROS: A Pre-trained Language Model for Understanding Texts in Document | Jan 1, 2021 | Decoder, Diversity | —Unverified | 0 | 0 |
| BuDDIE: A Business Document Dataset for Multi-task Information Extraction | Apr 5, 2024 | Document Classification, document understanding | —Unverified | 0 | 0 |
| Building and better understanding vision-language models: insights and future directions | Aug 22, 2024 | document understanding | —Unverified | 0 | 0 |
| Calculating Semantic Similarity between Academic Articles using Topic Event and Ontology | Nov 30, 2017 | Articles, document understanding | —Unverified | 0 | 0 |
| Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence | Mar 27, 2024 | Document AI, document understanding | —Unverified | 0 | 0 |
| Read and Think: An Efficient Step-wise Multimodal Language Model for Document Understanding and Reasoning | Feb 26, 2024 | Data Augmentation, document understanding | —Unverified | 0 | 0 |
| ClueWeb22: 10 Billion Web Documents with Visual and Semantic Information | Nov 29, 2022 | document understanding, Retrieval | —Unverified | 0 | 0 |
| CREPE: Coordinate-Aware End-to-End Document Parser | May 1, 2024 | document understanding, Optical Character Recognition (OCR) | —Unverified | 0 | 0 |
| DAViD: Domain Adaptive Visually-Rich Document Understanding with Synthetic Insights | Oct 2, 2024 | document understanding, Domain Adaptation | —Unverified | 0 | 0 |