| DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | May 7, 2024 | Binarization, Deblurring | Code Available | 4 |
| Unifying Vision, Text, and Layout for Universal Document Processing | Dec 5, 2022 | Document AI, document understanding | Code Available | 3 |
| LayoutLM: Pre-training of Text and Layout for Document Image Understanding | Dec 31, 2019 | Document AI, document-image-classification | Code Available | 2 |
| Modular Multimodal Machine Learning for Extraction of Theorems and Proofs in Long Scientific Documents (Extended Version) | Jul 18, 2023 | Articles, Document AI | Code Available | 1 |
| On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning | Jun 17, 2024 | Document AI, Model Optimization | Code Available | 1 |
| DiT: Self-supervised Pre-training for Document Image Transformer | Mar 4, 2022 | Document AI, document-image-classification | Code Available | 1 |
| DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading | Oct 23, 2023 | Document AI, document understanding | Code Available | 1 |
| ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction | Mar 9, 2023 | Document AI, In-Context Learning | Code Available | 1 |
| Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis | Aug 29, 2023 | Document AI, Document Layout Analysis | Code Available | 1 |
| Document Understanding Dataset and Evaluation (DUDE) | May 15, 2023 | Document AI, document understanding | Code Available | 1 |
| Document Intelligence Metrics for Visually Rich Document Evaluation | May 23, 2022 | Document AI | Code Available | 1 |
| OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation | Jul 26, 2024 | Benchmarking, Document AI | Code Available | 1 |
| Context-Aware Chart Element Detection | May 7, 2023 | Data Visualization, Document AI | Code Available | 1 |
| A Multi-Modal Multilingual Benchmark for Document Image Classification | Oct 25, 2023 | Classification, Cross-Lingual Transfer | Unverified | 0 |
| BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations | Jan 6, 2025 | Document AI, document understanding | Unverified | 0 |
| Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence | Mar 27, 2024 | Document AI, document understanding | Unverified | 0 |
| Development of a Legal Document AI-Chatbot | Nov 21, 2023 | Chatbot, Document AI | Unverified | 0 |
| Document AI: Benchmarks, Models and Applications | Nov 16, 2021 | Deep Learning, Document AI | Unverified | 0 |
| DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | Dec 17, 2024 | Document AI, Document Image Classification | Unverified | 0 |
| Enhancing Document AI Data Generation Through Graph-Based Synthetic Layouts | Nov 27, 2024 | Document AI, Document Classification | Unverified | 0 |
| FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction | Mar 16, 2022 | Document AI, document understanding | Unverified | 0 |
| H2OVL-Mississippi Vision Language Models Technical Report | Oct 17, 2024 | Document AI, Visual Question Answering | Unverified | 0 |
| ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images | Jun 5, 2023 | Document AI, Entity Linking | Unverified | 0 |
| LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents | Jan 26, 2024 | 4k, Document AI | Unverified | 0 |
| Model Reporting for Certifiable AI: A Proposal from Merging EU Regulation into AI Development | Jul 21, 2023 | Document AI | Unverified | 0 |
| NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | Apr 12, 2025 | Benchmarking, Document AI | Unverified | 0 |
| Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | Dec 1, 2023 | Chart Question Answering, Document AI | Unverified | 0 |
| PrIeD-KIE: Towards Privacy Preserved Document Key Information Extraction | Oct 5, 2023 | Document AI, Federated Learning | Unverified | 0 |
| Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents | Mar 23, 2024 | Document AI, Reading Comprehension | Unverified | 0 |
| DoSA: A System to Accelerate Annotations on Business Documents with Human-in-the-Loop | Nov 9, 2022 | Document AI, Key Information Extraction | Code Available | 0 |
| XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | May 27, 2024 | Document AI, Form | Code Available | 0 |
| DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond | Oct 19, 2023 | Document AI, Document Layout Analysis | Code Available | 0 |
| Design of a Quality Management System based on the EU Artificial Intelligence Act | Aug 8, 2024 | Document AI, GPU | Code Available | 0 |
| DiMSum: Distributed and Multilingual Summarization of Financial Narratives | Jun 1, 2022 | Document AI, Document Summarization | Code Available | 0 |
| Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing | Jun 1, 2025 | Document AI, document understanding | Code Available | 0 |
| LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding | Apr 8, 2024 | Document AI, document understanding | Code Available | 0 |
| GeoLayoutLM: Geometric Pre-training for Visual Information Extraction | Apr 21, 2023 | Document AI, entity_extraction | Code Available | 0 |
| LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | Apr 18, 2022 | cross-modal alignment, Document AI | Code Available | 0 |
| Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification | Aug 20, 2024 | Document AI, Document Classification | Code Available | 0 |
| Vision Grid Transformer for Document Layout Analysis | Aug 29, 2023 | Document AI, Document Layout Analysis | Code Available | 0 |