SOTAVerified

Document AI

Papers

Showing 1-40 of 40 papers

| Title | Status | Hype |
|---|---|---|
| Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing | Code | 0 |
| NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding |  | 0 |
| BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations |  | 0 |
| DoPTA: Improving Document Layout Analysis using Patch-Text Alignment |  | 0 |
| Enhancing Document AI Data Generation Through Graph-Based Synthetic Layouts |  | 0 |
| H2OVL-Mississippi Vision Language Models Technical Report |  | 0 |
| Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification | Code | 0 |
| Design of a Quality Management System based on the EU Artificial Intelligence Act | Code | 0 |
| OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation | Code | 1 |
| On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning | Code | 1 |
| XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | Code | 0 |
| DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | Code | 4 |
| LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding | Code | 0 |
| Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence |  | 0 |
| Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents |  | 0 |
| LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents |  | 0 |
| Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts |  | 0 |
| Development of a Legal Document AI-Chatbot |  | 0 |
| A Multi-Modal Multilingual Benchmark for Document Image Classification |  | 0 |
| DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading | Code | 1 |
| DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond | Code | 0 |
| PrIeD-KIE: Towards Privacy Preserved Document Key Information Extraction |  | 0 |
| Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis | Code | 1 |
| Vision Grid Transformer for Document Layout Analysis | Code | 0 |
| Model Reporting for Certifiable AI: A Proposal from Merging EU Regulation into AI Development |  | 0 |
| Modular Multimodal Machine Learning for Extraction of Theorems and Proofs in Long Scientific Documents (Extended Version) | Code | 1 |
| ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images |  | 0 |
| Document Understanding Dataset and Evaluation (DUDE) | Code | 1 |
| Context-Aware Chart Element Detection | Code | 1 |
| GeoLayoutLM: Geometric Pre-training for Visual Information Extraction | Code | 0 |
| ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction | Code | 1 |
| Unifying Vision, Text, and Layout for Universal Document Processing | Code | 3 |
| DoSA : A System to Accelerate Annotations on Business Documents with Human-in-the-Loop | Code | 0 |
| DiMSum: Distributed and Multilingual Summarization of Financial Narratives | Code | 0 |
| Document Intelligence Metrics for Visually Rich Document Evaluation | Code | 1 |
| LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | Code | 0 |
| FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction |  | 0 |
| DiT: Self-supervised Pre-training for Document Image Transformer | Code | 1 |
| Document AI: Benchmarks, Models and Applications |  | 0 |
| LayoutLM: Pre-training of Text and Layout for Document Image Understanding | Code | 2 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LayoutLMv3 | Average F1 | 99.21 |  | Unverified |