SOTAVerified

document understanding

Document understanding involves document classification, layout analysis, information extraction, and document question answering (DocQA).

Papers

Showing 276-300 of 309 papers

Title | Status | Hype
LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding | | 0
LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding | | 0
Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5 | | 0
Leveraging Domain Agnostic and Specific Knowledge for Acronym Disambiguation | | 0
Leveraging Long-Context Large Language Models for Multi-Document Understanding and Summarization in Enterprise Applications | | 0
LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents | | 0
LoPE: Learnable Sinusoidal Positional Encoding for Improving Document Transformer Model | | 0
LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding | | 0
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding | | 0
MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary | | 0
MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications | | 0
MATrIX -- Modality-Aware Transformer for Information eXtraction | | 0
Memory-Augmented Agent Training for Business Document Understanding | | 0
Merge and Recognize: A Geometry and 2D Context Aware Graph Model for Named Entity Recognition from Visual Documents | | 0
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework | | 0
MMDocBench: Benchmarking Large Vision-Language Models for Fine-Grained Visual Document Understanding | | 0
mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding | | 0
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding | | 0
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding | | 0
MT^3: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning | | 0
Multi-modal Information Extraction from Text, Semi-structured, and Tabular Data on the Web | | 0
NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition | | 0
NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | | 0
Notes on Applicability of GPT-4 to Document Understanding | | 0
Object-oriented Neural Programming (OONP) for Document Understanding | | 0

No leaderboard results yet.