SOTAVerified

document understanding

Document understanding involves document classification, layout analysis, information extraction, and document question answering (DocQA).

Papers

Showing 101–150 of 309 papers

Title | Status | Hype
LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding | | 0
A Multi-Modal Multilingual Benchmark for Document Image Classification | | 0
Leveraging Domain Agnostic and Specific Knowledge for Acronym Disambiguation | | 0
DOGE: Towards Versatile Visual Document Grounding and Referring | | 0
A Token-level Text Image Foundation Model for Document Understanding | | 0
Enumeration of Extractive Oracle Summaries | | 0
A LayoutLMv3-Based Model for Enhanced Relation Extraction in Visually-Rich Documents | | 0
Leveraging Long-Context Large Language Models for Multi-Document Understanding and Summarization in Enterprise Applications | | 0
DocVLM: Make Your VLM an Efficient Reader | | 0
A User-Centered Concept Mining System for Query and Document Understanding at Tencent | | 0
Document Understanding for Healthcare Referrals | | 0
Read and Think: An Efficient Step-wise Multimodal Language Model for Document Understanding and Reasoning | | 0
DAViD: Domain Adaptive Visually-Rich Document Understanding with Synthetic Insights | | 0
Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding | | 0
Decontextualization: Making Sentences Stand-Alone | | 0
Finding Pragmatic Differences Between Disciplines | | 0
Document Layout Analysis with Aesthetic-Guided Image Augmentation | | 0
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction | | 0
FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction | | 0
AT-BERT: Adversarial Training BERT for Acronym Identification Winning Solution for SDU@AAAI-21 | | 0
Friendly Topic Assistant for Transformer Based Abstractive Summarization | | 0
From Entity Linking to Question Answering -- Recent Progress on Semantic Grounding Tasks | | 0
Document Image Rectification Bases on Self-Adaptive Multitask Fusion | | 0
Génération de question à partir d’analyse sémantique pour l’adaptation non supervisée de modèles de compréhension de documents (Question generation from semantic analysis for unsupervised adaptation of document understanding models) | | 0
DocumentNet: Bridging the Data Gap in Document Pre-Training | | 0
Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence | | 0
Document Collection Visual Question Answering | | 0
A Survey on Vietnamese Document Analysis and Recognition: Challenges and Future Directions | | 0
Calculating Semantic Similarity between Academic Articles using Topic Event and Ontology | | 0
DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding | | 0
Building and better understanding vision-language models: insights and future directions | | 0
A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends | | 0
BuDDIE: A Business Document Dataset for Multi-task Information Extraction | | 0
DocMamba: Efficient Document Pre-training with State Space Model | | 0
A Survey and Approach to Chart Classification | | 0
DocLLM: A layout-aware generative language model for multimodal document understanding | | 0
BROS: A Pre-trained Language Model for Understanding Texts in Document | | 0
LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding | | 0
Is Cognition consistent with Perception? Assessing and Mitigating Multimodal Knowledge Conflicts in Document Understanding | | 0
BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations | | 0
LAPDoc: Layout-Aware Prompting for Documents | | 0
DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming | | 0
A Simple yet Effective Layout Token in Large Language Models for Document Understanding | | 0
Information Extraction from Heterogeneous Documents without Ground Truth Labels using Synthetic Label Generation and Knowledge Distillation | | 0
DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models | | 0
LAMPRET: Layout-Aware Multimodal PreTraining for Document Understanding | | 0
LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding | | 0
DocGraphLM: Documental Graph Language Model for Information Extraction | | 0
Improving Keyphrase Extraction with Data Augmentation and Information Filtering | | 0
Joint Structured Learning and Predictions under Logical Constraints in Conditional Random Fields | | 0

No leaderboard results yet.