| A Survey and Approach to Chart Classification | Jul 9, 2023 | Chart Understanding, Classification | Unverified | 0 |
| mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding | Jul 4, 2023 | document understanding, Language Modeling | Code Available | 0 |
| DocumentNet: Bridging the Data Gap in Document Pre-Training | Jun 15, 2023 | document understanding, Entity Retrieval | Unverified | 0 |
| Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models | Jun 5, 2023 | document understanding, Question Answering | Code Available | 0 |
| Table Detection for Visually Rich Document Images | May 30, 2023 | document understanding, object-detection | Code Available | 0 |
| LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding | May 30, 2023 | document-image-classification, Document Image Classification | Unverified | 0 |
| Pre-training Meets Clustering: A Hybrid Extractive Multi-document Summarization Model | May 25, 2023 | Clustering, Document Summarization | Code Available | 0 |
| AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content | May 24, 2023 | Document Summarization, document understanding | Unverified | 0 |
| DUBLIN -- Document Understanding By Language-Image Network | May 23, 2023 | Document Classification, document understanding | Unverified | 0 |
| Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding | May 19, 2023 | document understanding | Unverified | 0 |
| Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding | May 16, 2023 | Decoder, document understanding | Unverified | 0 |
| DLUE: Benchmarking Document Language Understanding | May 16, 2023 | Benchmarking, Document Classification | Unverified | 0 |
| M^6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis | May 15, 2023 | Articles, Document Layout Analysis | Code Available | 0 |
| Two to Five Truths in Non-Negative Matrix Factorization | May 6, 2023 | Clustering, document understanding | Unverified | 0 |
| FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction | May 4, 2023 | Contrastive Learning, document understanding | Unverified | 0 |
| Revisiting Table Detection Datasets for Visually Rich Documents | May 4, 2023 | document understanding, object-detection | Unverified | 0 |
| Information Redundancy and Biases in Public Document Information Extraction Benchmarks | Apr 28, 2023 | document understanding, Key Information Extraction | Code Available | 0 |
| What Makes a Good Dataset for Symbol Description Reading? | Apr 17, 2023 | document understanding, Math | Unverified | 0 |
| PDFVQA: A New Dataset for Real-World VQA on PDF Documents | Apr 13, 2023 | document understanding, Key Information Extraction | Unverified | 0 |
| Is ChatGPT A Good Keyphrase Generator? A Preliminary Study | Mar 23, 2023 | Diversity, document understanding | Code Available | 0 |
| Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding | Dec 19, 2022 | Contrastive Learning, document understanding | Code Available | 0 |
| Multimodal Tree Decoder for Table of Contents Extraction in Document Images | Dec 6, 2022 | Decoder, document understanding | Code Available | 0 |
| ClueWeb22: 10 Billion Web Documents with Visual and Semantic Information | Nov 29, 2022 | document understanding, Retrieval | Unverified | 0 |
| VRDU: A Benchmark for Visually-rich Document Understanding | Nov 15, 2022 | document understanding | Unverified | 0 |
| QueryForm: A Simple Zero-shot Form Entity Query Framework | Nov 14, 2022 | document understanding, Form | Unverified | 0 |
| Unimodal and Multimodal Representation Training for Relation Extraction | Nov 11, 2022 | document understanding, Relation | Unverified | 0 |
| Transformer-based Approach for Document Understanding | Oct 16, 2022 | Decoder, Document Layout Analysis | Unverified | 0 |
| KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding | Oct 8, 2022 | document understanding, Knowledge Graphs | Code Available | 0 |
| XDoc: Unified Pre-training for Cross-Format Document Understanding | Oct 6, 2022 | document understanding, Semantic entity labeling | Code Available | 0 |
| ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding | Sep 18, 2022 | Common Sense Reasoning, document understanding | Unverified | 0 |
| One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text | Sep 12, 2022 | document understanding, object-detection | Unverified | 0 |
| Improving Keyphrase Extraction with Data Augmentation and Information Filtering | Sep 11, 2022 | Data Augmentation, document understanding | Unverified | 0 |
| DeeperDive: The Unreasonable Effectiveness of Weak Supervision in Document Understanding A Case Study in Collaboration with UiPath Inc | Aug 17, 2022 | document understanding, Form | Unverified | 0 |
| Understanding Long Documents with Different Position-Aware Attentions | Aug 17, 2022 | document understanding, Position | Unverified | 0 |
| Knowing Where and What: Unified Word Block Pretraining for Document Understanding | Jul 28, 2022 | Contrastive Learning, document understanding | Code Available | 0 |
| Towards Complex Document Understanding By Discrete Reasoning | Jul 25, 2022 | document understanding, Question Answering | Unverified | 0 |
| DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding | Jul 14, 2022 | document understanding, Optical Character Recognition (OCR) | Code Available | 0 |
| Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding | Jun 27, 2022 | Document Classification, document understanding | Unverified | 0 |
| Test-Time Adaptation for Visual Document Understanding | Jun 15, 2022 | document understanding, Domain Adaptation | Unverified | 0 |
| RDU: A Region-based Approach to Form-style Document Understanding | Jun 14, 2022 | document understanding, Form | Unverified | 0 |
| Génération de question à partir d’analyse sémantique pour l’adaptation non supervisée de modèles de compréhension de documents (Question generation from semantic analysis for unsupervised adaptation of document understanding models) | Jun 1, 2022 | document understanding, Question Generation | Unverified | 0 |
| MATrIX -- Modality-Aware Transformer for Information eXtraction | May 17, 2022 | document understanding | Unverified | 0 |
| MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding | May 1, 2022 | document understanding | Code Available | 0 |
| DuReader_vis: A Chinese Dataset for Open-domain Document Visual Question Answering | May 1, 2022 | document understanding, Open-Domain Question Answering | Code Available | 0 |
| XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding | May 1, 2022 | document understanding, Form | Unverified | 0 |
| Unified Pretraining Framework for Document Understanding | Apr 22, 2022 | Document Layout Analysis, document understanding | Unverified | 0 |
| Robust Text Line Detection in Historical Documents: Learning and Evaluation Methods | Mar 23, 2022 | document understanding, Line Detection | Unverified | 0 |
| FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction | Mar 16, 2022 | Document AI, document understanding | Unverified | 0 |
| Hierarchical BERT for Medical Document Understanding | Mar 11, 2022 | document understanding, Sentence | Unverified | 0 |
| WebFormer: The Web-page Transformer for Structure Information Extraction | Feb 1, 2022 | Deep Attention, document understanding | Unverified | 0 |