| DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | May 7, 2024 | BinarizationDeblurring | CodeCode Available | 4 |
| Unifying Vision, Text, and Layout for Universal Document Processing | Dec 5, 2022 | Document AIdocument understanding | CodeCode Available | 3 |
| LayoutLM: Pre-training of Text and Layout for Document Image Understanding | Dec 31, 2019 | Document AIdocument-image-classification | CodeCode Available | 2 |
| OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation | Jul 26, 2024 | BenchmarkingDocument AI | CodeCode Available | 1 |
| On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning | Jun 17, 2024 | Document AIModel Optimization | CodeCode Available | 1 |
| DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading | Oct 23, 2023 | Document AIdocument understanding | CodeCode Available | 1 |
| Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis | Aug 29, 2023 | Document AIDocument Layout Analysis | CodeCode Available | 1 |
| Modular Multimodal Machine Learning for Extraction of Theorems and Proofs in Long Scientific Documents (Extended Version) | Jul 18, 2023 | ArticlesDocument AI | CodeCode Available | 1 |
| Document Understanding Dataset and Evaluation (DUDE) | May 15, 2023 | Document AIdocument understanding | CodeCode Available | 1 |
| Context-Aware Chart Element Detection | May 7, 2023 | Data VisualizationDocument AI | CodeCode Available | 1 |
| ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction | Mar 9, 2023 | Document AIIn-Context Learning | CodeCode Available | 1 |
| Document Intelligence Metrics for Visually Rich Document Evaluation | May 23, 2022 | Document AI | CodeCode Available | 1 |
| DiT: Self-supervised Pre-training for Document Image Transformer | Mar 4, 2022 | Document AIdocument-image-classification | CodeCode Available | 1 |
| Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing | Jun 1, 2025 | Document AIdocument understanding | CodeCode Available | 0 |
| NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | Apr 12, 2025 | BenchmarkingDocument AI | —Unverified | 0 |
| BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations | Jan 6, 2025 | Document AIdocument understanding | —Unverified | 0 |
| DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | Dec 17, 2024 | Document AIDocument Image Classification | —Unverified | 0 |
| Enhancing Document AI Data Generation Through Graph-Based Synthetic Layouts | Nov 27, 2024 | Document AIDocument Classification | —Unverified | 0 |
| H2OVL-Mississippi Vision Language Models Technical Report | Oct 17, 2024 | Document AIVisual Question Answering | —Unverified | 0 |
| Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification | Aug 20, 2024 | Document AIDocument Classification | CodeCode Available | 0 |
| Design of a Quality Management System based on the EU Artificial Intelligence Act | Aug 8, 2024 | Document AIGPU | CodeCode Available | 0 |
| XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | May 27, 2024 | Document AIForm | CodeCode Available | 0 |
| LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding | Apr 8, 2024 | Document AIdocument understanding | CodeCode Available | 0 |
| Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence | Mar 27, 2024 | Document AIdocument understanding | —Unverified | 0 |
| Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents | Mar 23, 2024 | Document AIReading Comprehension | —Unverified | 0 |