| Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing | Jun 1, 2025 | Document AI, document understanding | Code Available | 0 |
| NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | Apr 12, 2025 | Benchmarking, Document AI | Unverified | 0 |
| BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations | Jan 6, 2025 | Document AI, document understanding | Unverified | 0 |
| DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | Dec 17, 2024 | Document AI, Document Image Classification | Unverified | 0 |
| Enhancing Document AI Data Generation Through Graph-Based Synthetic Layouts | Nov 27, 2024 | Document AI, Document Classification | Unverified | 0 |
| H2OVL-Mississippi Vision Language Models Technical Report | Oct 17, 2024 | Document AI, Visual Question Answering | Unverified | 0 |
| Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification | Aug 20, 2024 | Document AI, Document Classification | Code Available | 0 |
| Design of a Quality Management System based on the EU Artificial Intelligence Act | Aug 8, 2024 | Document AI, GPU | Code Available | 0 |
| OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation | Jul 26, 2024 | Benchmarking, Document AI | Code Available | 1 |
| On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning | Jun 17, 2024 | Document AI, Model Optimization | Code Available | 1 |
| XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | May 27, 2024 | Document AI, Form | Code Available | 0 |
| DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | May 7, 2024 | Binarization, Deblurring | Code Available | 4 |
| LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding | Apr 8, 2024 | Document AI, document understanding | Code Available | 0 |
| Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence | Mar 27, 2024 | Document AI, document understanding | Unverified | 0 |
| Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents | Mar 23, 2024 | Document AI, Reading Comprehension | Unverified | 0 |
| LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents | Jan 26, 2024 | 4k, Document AI | Unverified | 0 |
| Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | Dec 1, 2023 | Chart Question Answering, Document AI | Unverified | 0 |
| Development of a Legal Document AI-Chatbot | Nov 21, 2023 | Chatbot, Document AI | Unverified | 0 |
| A Multi-Modal Multilingual Benchmark for Document Image Classification | Oct 25, 2023 | Classification, Cross-Lingual Transfer | Unverified | 0 |
| DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading | Oct 23, 2023 | Document AI, document understanding | Code Available | 1 |
| DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond | Oct 19, 2023 | Document AI, Document Layout Analysis | Code Available | 0 |
| PrIeD-KIE: Towards Privacy Preserved Document Key Information Extraction | Oct 5, 2023 | Document AI, Federated Learning | Unverified | 0 |
| Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis | Aug 29, 2023 | Document AI, Document Layout Analysis | Code Available | 1 |
| Vision Grid Transformer for Document Layout Analysis | Aug 29, 2023 | Document AI, Document Layout Analysis | Code Available | 0 |
| Model Reporting for Certifiable AI: A Proposal from Merging EU Regulation into AI Development | Jul 21, 2023 | Document AI | Unverified | 0 |