SOTAVerified

Document Understanding

Document understanding covers document classification, layout analysis, information extraction, and document question answering (DocQA).

Papers

Showing 201-225 of 309 papers

| Title | Status | Hype |
|---|---|---|
| M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis | Code | 1 |
| Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding | Code | 0 |
| Multimodal Tree Decoder for Table of Contents Extraction in Document Images | Code | 0 |
| Unifying Vision, Text, and Layout for Universal Document Processing | Code | 3 |
| ClueWeb22: 10 Billion Web Documents with Visual and Semantic Information | | 0 |
| VRDU: A Benchmark for Visually-rich Document Understanding | | 0 |
| QueryForm: A Simple Zero-shot Form Entity Query Framework | | 0 |
| Unimodal and Multimodal Representation Training for Relation Extraction | | 0 |
| On Web-based Visual Corpus Construction for Visual Document Understanding | Code | 1 |
| Transformer-based Approach for Document Understanding | | 0 |
| ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding | Code | 1 |
| KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding | Code | 0 |
| XDoc: Unified Pre-training for Cross-Format Document Understanding | Code | 0 |
| DocQueryNet: Value Retrieval with Arbitrary Queries for Form-like Documents | Code | 1 |
| ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding | | 0 |
| One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text | | 0 |
| Improving Keyphrase Extraction with Data Augmentation and Information Filtering | | 0 |
| Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks | Code | 1 |
| DeeperDive: The Unreasonable Effectiveness of Weak Supervision in Document Understanding A Case Study in Collaboration with UiPath Inc | | 0 |
| Understanding Long Documents with Different Position-Aware Attentions | | 0 |
| Knowing Where and What: Unified Word Block Pretraining for Document Understanding | Code | 0 |
| Towards Complex Document Understanding By Discrete Reasoning | | 0 |
| DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding | | 0 |
| Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding | | 0 |
| Test-Time Adaptation for Visual Document Understanding | | 0 |

No leaderboard results yet.