SOTAVerified

document understanding

Document understanding involves document classification, layout analysis, information extraction, and DocQA.

Papers

Showing 151–175 of 309 papers

| Title | Status | Hype |
| --- | --- | --- |
| DrVideo: Document Retrieval Based Long Video Understanding | | 0 |
| DUBLIN -- Document Understanding By Language-Image Network | | 0 |
| Efficient End-to-End Visual Document Understanding with Rationale Distillation | | 0 |
| Efficient layout-aware pretraining for multimodal form understanding | | 0 |
| Enhancing Question Answering on Charts Through Effective Pre-training Tasks | | 0 |
| Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models | | 0 |
| Enumeration of Extractive Oracle Summaries | | 0 |
| ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding | | 0 |
| Extract with Order for Coherent Multi-Document Summarization | | 0 |
| Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding | | 0 |
| Finding Pragmatic Differences Between Disciplines | | 0 |
| FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction | | 0 |
| Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5 | | 0 |
| Leveraging Domain Agnostic and Specific Knowledge for Acronym Disambiguation | | 0 |
| Leveraging Long-Context Large Language Models for Multi-Document Understanding and Summarization in Enterprise Applications | | 0 |
| LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents | | 0 |
| LoPE: Learnable Sinusoidal Positional Encoding for Improving Document Transformer Model | | 0 |
| LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding | | 0 |
| M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding | | 0 |
| MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary | | 0 |
| MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications | | 0 |
| MATrIX -- Modality-Aware Transformer for Information eXtraction | | 0 |
| Memory-Augmented Agent Training for Business Document Understanding | | 0 |
| Merge and Recognize: A Geometry and 2D Context Aware Graph Model for Named Entity Recognition from Visual Documents | | 0 |
| M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework | | 0 |

No leaderboard results yet.