SOTAVerified

document understanding

Document understanding covers document classification, layout analysis, information extraction, and document question answering (DocQA).

Papers

Showing 201-250 of 309 papers

Title | Status | Hype
A Survey and Approach to Chart Classification | - | 0
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding | Code | 0
DocumentNet: Bridging the Data Gap in Document Pre-Training | - | 0
Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models | Code | 0
Table Detection for Visually Rich Document Images | Code | 0
LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding | - | 0
Pre-training Meets Clustering: A Hybrid Extractive Multi-document Summarization Model | Code | 0
AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content | - | 0
DUBLIN -- Document Understanding By Language-Image Network | - | 0
Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding | - | 0
Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding | - | 0
DLUE: Benchmarking Document Language Understanding | - | 0
M^6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis | Code | 0
Two to Five Truths in Non-Negative Matrix Factorization | - | 0
FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction | - | 0
Revisiting Table Detection Datasets for Visually Rich Documents | - | 0
Information Redundancy and Biases in Public Document Information Extraction Benchmarks | Code | 0
What Makes a Good Dataset for Symbol Description Reading? | - | 0
PDFVQA: A New Dataset for Real-World VQA on PDF Documents | - | 0
Is ChatGPT A Good Keyphrase Generator? A Preliminary Study | Code | 0
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding | Code | 0
Multimodal Tree Decoder for Table of Contents Extraction in Document Images | Code | 0
ClueWeb22: 10 Billion Web Documents with Visual and Semantic Information | - | 0
VRDU: A Benchmark for Visually-rich Document Understanding | - | 0
QueryForm: A Simple Zero-shot Form Entity Query Framework | - | 0
Unimodal and Multimodal Representation Training for Relation Extraction | - | 0
Transformer-based Approach for Document Understanding | - | 0
KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding | Code | 0
XDoc: Unified Pre-training for Cross-Format Document Understanding | Code | 0
ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding | - | 0
One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text | - | 0
Improving Keyphrase Extraction with Data Augmentation and Information Filtering | - | 0
DeeperDive: The Unreasonable Effectiveness of Weak Supervision in Document Understanding A Case Study in Collaboration with UiPath Inc | - | 0
Understanding Long Documents with Different Position-Aware Attentions | - | 0
Knowing Where and What: Unified Word Block Pretraining for Document Understanding | Code | 0
Towards Complex Document Understanding By Discrete Reasoning | - | 0
DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding | Code | 0
Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding | - | 0
Test-Time Adaptation for Visual Document Understanding | - | 0
RDU: A Region-based Approach to Form-style Document Understanding | - | 0
Génération de question à partir d’analyse sémantique pour l’adaptation non supervisée de modèles de compréhension de documents (Question generation from semantic analysis for unsupervised adaptation of document understanding models) | - | 0
MATrIX -- Modality-Aware Transformer for Information eXtraction | - | 0
MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding | Code | 0
DuReader_vis: A Chinese Dataset for Open-domain Document Visual Question Answering | Code | 0
XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding | - | 0
Unified Pretraining Framework for Document Understanding | - | 0
Robust Text Line Detection in Historical Documents: Learning and Evaluation Methods | - | 0
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction | - | 0
Hierarchical BERT for Medical Document Understanding | - | 0
WebFormer: The Web-page Transformer for Structure Information Extraction | - | 0

No leaderboard results yet.