SOTAVerified

document understanding

Document understanding involves document classification, layout analysis, information extraction, and DocQA.

Papers

Showing 11–20 of 309 papers

| Title | Status | Hype |
| --- | --- | --- |
| INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning | Code | 3 |
| Unifying Vision, Text, and Layout for Universal Document Processing | Code | 3 |
| OCR-free Document Understanding Transformer | Code | 3 |
| AIN: The Arabic INclusive Large Multimodal Model | Code | 2 |
| Arabic-Nougat: Fine-Tuning Vision Transformers for Arabic OCR and Markdown Extraction | Code | 2 |
| PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling | Code | 2 |
| One missing piece in Vision and Language: A Survey on Comics Understanding | Code | 2 |
| A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding | Code | 2 |
| MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations | Code | 2 |
| Visually Guided Generative Text-Layout Pre-training for Document Intelligence | Code | 2 |

No leaderboard results yet.