SOTAVerified

Document AI

Papers

Showing 1-25 of 40 papers

| Title | Status | Hype |
|---|---|---|
| DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | Code | 4 |
| Unifying Vision, Text, and Layout for Universal Document Processing | Code | 3 |
| LayoutLM: Pre-training of Text and Layout for Document Image Understanding | Code | 2 |
| OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation | Code | 1 |
| Document Understanding Dataset and Evaluation (DUDE) | Code | 1 |
| DiT: Self-supervised Pre-training for Document Image Transformer | Code | 1 |
| On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning | Code | 1 |
| DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading | Code | 1 |
| ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction | Code | 1 |
| Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis | Code | 1 |
| Context-Aware Chart Element Detection | Code | 1 |
| Modular Multimodal Machine Learning for Extraction of Theorems and Proofs in Long Scientific Documents (Extended Version) | Code | 1 |
| Document Intelligence Metrics for Visually Rich Document Evaluation | Code | 1 |
| Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing | Code | 0 |
| LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | Code | 0 |
| Design of a Quality Management System based on the EU Artificial Intelligence Act | Code | 0 |
| Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification | Code | 0 |
| DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond | Code | 0 |
| Vision Grid Transformer for Document Layout Analysis | Code | 0 |
| XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | Code | 0 |
| DoSA : A System to Accelerate Annotations on Business Documents with Human-in-the-Loop | Code | 0 |
| DiMSum: Distributed and Multilingual Summarization of Financial Narratives | Code | 0 |
| GeoLayoutLM: Geometric Pre-training for Visual Information Extraction | Code | 0 |
| LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding | Code | 0 |
| H2OVL-Mississippi Vision Language Models Technical Report | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LayoutLMv3 | Average F1 | 99.21 | | Unverified |