SOTAVerified

Document Understanding

Document understanding covers document classification, layout analysis, information extraction, and document question answering (DocQA).

Papers

Showing 226–250 of 309 papers

| Title | Status | Hype |
| --- | --- | --- |
| XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding | | 0 |
| Deep Learning based Visually Rich Document Content Understanding: A Survey | | 0 |
| LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents | | 0 |
| LoPE: Learnable Sinusoidal Positional Encoding for Improving Document Transformer Model | | 0 |
| LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding | | 0 |
| M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding | | 0 |
| MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary | | 0 |
| MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications | | 0 |
| MATrIX -- Modality-Aware Transformer for Information eXtraction | | 0 |
| Memory-Augmented Agent Training for Business Document Understanding | | 0 |
| Merge and Recognize: A Geometry and 2D Context Aware Graph Model for Named Entity Recognition from Visual Documents | | 0 |
| M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework | | 0 |
| MMDocBench: Benchmarking Large Vision-Language Models for Fine-Grained Visual Document Understanding | | 0 |
| mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding | | 0 |
| mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding | | 0 |
| mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding | | 0 |
| MT^3: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning | | 0 |
| Multi-modal Information Extraction from Text, Semi-structured, and Tabular Data on the Web | | 0 |
| NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition | | 0 |
| NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | | 0 |
| Notes on Applicability of GPT-4 to Document Understanding | | 0 |
| Object-oriented Neural Programming (OONP) for Document Understanding | | 0 |
| OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition | | 0 |
| OmniParser: A Unified Framework for Text Spotting Key Information Extraction and Table Recognition | | 0 |
| OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models | | 0 |

No leaderboard results yet.