SOTAVerified

Document Understanding

Document understanding covers document classification, layout analysis, information extraction, and document question answering (DocQA).

Papers

Showing 201-250 of 309 papers

| Title | Status | Hype |
|---|---|---|
| Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer | | 0 |
| Automatic Knowledge Extraction with Human Interface | | 0 |
| AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content | | 0 |
| BERT-AL: BERT for Arbitrarily Long Document Understanding | | 0 |
| BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks | | 0 |
| Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding | | 0 |
| BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations | | 0 |
| BROS: A Pre-trained Language Model for Understanding Texts in Document | | 0 |
| BuDDIE: A Business Document Dataset for Multi-task Information Extraction | | 0 |
| Building and better understanding vision-language models: insights and future directions | | 0 |
| Calculating Semantic Similarity between Academic Articles using Topic Event and Ontology | | 0 |
| Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence | | 0 |
| Read and Think: An Efficient Step-wise Multimodal Language Model for Document Understanding and Reasoning | | 0 |
| ClueWeb22: 10 Billion Web Documents with Visual and Semantic Information | | 0 |
| CREPE: Coordinate-Aware End-to-End Document Parser | | 0 |
| DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding | | 0 |
| DAViD: Domain Adaptive Visually-Rich Document Understanding with Synthetic Insights | | 0 |
| Decontextualization: Making Sentences Stand-Alone | | 0 |
| DeeperDive: The Unreasonable Effectiveness of Weak Supervision in Document Understanding A Case Study in Collaboration with UiPath Inc | | 0 |
| Deep Learning based Key Information Extraction from Business Documents: Systematic Literature Review | | 0 |
| DiCoRe: Enhancing Zero-shot Event Detection via Divergent-Convergent LLM Reasoning | | 0 |
| DistilDoc: Knowledge Distillation for Visually-Rich Document Applications | | 0 |
| DLUE: Benchmarking Document Language Understanding | | 0 |
| Doc2Im: document to image conversion through self-attentive embedding | | 0 |
| Doc-CoB: Enhancing Multi-Modal Document Understanding with Visual Chain-of-Boxes Reasoning | | 0 |
| DocGraphLM: Documental Graph Language Model for Information Extraction | | 0 |
| DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models | | 0 |
| DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming | | 0 |
| DocLLM: A layout-aware generative language model for multimodal document understanding | | 0 |
| DocMamba: Efficient Document Pre-training with State Space Model | | 0 |
| DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding | | 0 |
| Document Collection Visual Question Answering | | 0 |
| DocumentNet: Bridging the Data Gap in Document Pre-Training | | 0 |
| Document Image Rectification Bases on Self-Adaptive Multitask Fusion | | 0 |
| Document Layout Analysis with Aesthetic-Guided Image Augmentation | | 0 |
| Document Understanding for Healthcare Referrals | | 0 |
| DocVLM: Make Your VLM an Efficient Reader | | 0 |
| DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond | | 0 |
| DOGE: Towards Versatile Visual Document Grounding and Referring | | 0 |
| DONUT-hole: DONUT Sparsification by Harnessing Knowledge and Optimizing Learning Efficiency | | 0 |
| DrVideo: Document Retrieval Based Long Video Understanding | | 0 |
| DUBLIN -- Document Understanding By Language-Image Network | | 0 |
| Efficient End-to-End Visual Document Understanding with Rationale Distillation | | 0 |
| Efficient layout-aware pretraining for multimodal form understanding | | 0 |
| Enhancing Question Answering on Charts Through Effective Pre-training Tasks | | 0 |
| Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models | | 0 |
| Enumeration of Extractive Oracle Summaries | | 0 |
| ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding | | 0 |
| Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling | | 0 |
| Extract with Order for Coherent Multi-Document Summarization | | 0 |

No leaderboard results yet.