SOTAVerified

Document Understanding

Document understanding covers document classification, layout analysis, information extraction, and document question answering (DocQA).

Papers

Showing 1–10 of 309 papers

Title | Status | Hype
A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends | - | 0
PaddleOCR 3.0 Technical Report | Code | 0
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning | Code | 7
Class-Agnostic Region-of-Interest Matching in Document Images | Code | 0
DrishtiKon: Multi-Granular Visual Grounding for Text-Rich Document Images | Code | 0
Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models | - | 0
PP-DocBee2: Improved Baselines with Efficient Data for Multimodal Document Understanding | Code | 0
WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts | - | 0
SimpleDoc: Multi-Modal Document Understanding with Dual-Cue Page Retrieval and Iterative Refinement | Code | 1
DiCoRe: Enhancing Zero-shot Event Detection via Divergent-Convergent LLM Reasoning | - | 0

No leaderboard results yet.