SOTAVerified

document understanding

Document understanding involves document classification, layout analysis, information extraction, and DocQA.

Papers

Showing 51–75 of 309 papers

| Title | Status | Hype |
|---|---|---|
| Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding | Code | 1 |
| ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark | Code | 1 |
| DANIEL: A fast Document Attention Network for Information Extraction and Labelling of handwritten documents | Code | 1 |
| Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs | Code | 1 |
| DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents | Code | 1 |
| DocQueryNet: Value Retrieval with Arbitrary Queries for Form-like Documents | Code | 1 |
| Document Understanding Dataset and Evaluation (DUDE) | Code | 1 |
| Value Retrieval with Arbitrary Queries for Form-like Documents | Code | 1 |
| LineFormer: Rethinking Line Chart Data Extraction as Instance Segmentation | Code | 1 |
| WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data | Code | 1 |
| Docopilot: Improving Multimodal Models for Document-Level Understanding | Code | 1 |
| LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating | Code | 1 |
| Multimodal Pre-training Based on Graph Attention Network for Document Understanding | Code | 1 |
| SimpleDoc: Multi-Modal Document Understanding with Dual-Cue Page Retrieval and Iterative Refinement | Code | 1 |
| Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks | Code | 1 |
| ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding | Code | 1 |
| A Discrete Variational Recurrent Topic Model without the Reparametrization Trick | Code | 1 |
| DocFormer: End-to-End Transformer for Document Understanding | Code | 1 |
| DocFormerv2: Local Features for Document Understanding | Code | 1 |
| LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding | Code | 0 |
| Learned Compression for Compressed Learning | Code | 0 |
| LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding | Code | 0 |
| Knowing Where and What: Unified Word Block Pretraining for Document Understanding | Code | 0 |
| LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding | Code | 0 |
| Deeper Clinical Document Understanding Using Relation Extraction | Code | 0 |

No leaderboard results yet.