SOTAVerified

document understanding

Document understanding covers document classification, layout analysis, information extraction, and document question answering (DocQA).

Papers

Showing 201-250 of 309 papers

Title | Status | Hype
Scalable Cross Lingual Pivots to Model Pronoun Gender for Translation | - | 0
Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models | - | 0
Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding | - | 0
Shakti-VLMs: Scalable Vision-Language Models for Enterprise AI | - | 0
SimCLAD: A Simple Framework for Contrastive Learning of Acronym Disambiguation | - | 0
SLJP: Semantic Extraction based Legal Judgment Prediction | - | 0
StructFormer: Document Structure-based Masked Attention and its Impact on Language Model Pre-Training | - | 0
Survey on Question Answering over Visually Rich Documents: Methods, Challenges, and Trends | - | 0
SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding | - | 0
Table-Of-Contents generation on contemporary documents | - | 0
Table Structure Extraction with Bi-directional Gated Recurrent Unit Networks | - | 0
Test-Time Adaptation for Visual Document Understanding | - | 0
The Hidden Structure -- Improving Legal Document Understanding Through Explicit Text Formatting | - | 0
The Law of Large Documents: Understanding the Structure of Legal Contracts Using Visual Cues | - | 0
The MERIT Dataset: Modelling and Efficiently Rendering Interpretable Transcripts | - | 0
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection | - | 0
Towards Complex Document Understanding By Discrete Reasoning | - | 0
Towards Efficient Resume Understanding: A Multi-Granularity Multi-Modal Pre-Training Approach | - | 0
Towards Natural Language-Based Document Image Retrieval: New Dataset and Benchmark | - | 0
GlobalDoc: A Cross-Modal Vision-Language Framework for Real-World Document Image Retrieval and Classification | - | 0
Transformer-based Approach for Document Understanding | - | 0
Two to Five Truths in Non-Negative Matrix Factorization | - | 0
Understanding Long Documents with Different Position-Aware Attentions | - | 0
UniDoc: Unified Pretraining Framework for Document Understanding | - | 0
Unified Pretraining Framework for Document Understanding | - | 0
Unimodal and Multimodal Representation Training for Relation Extraction | - | 0
ViRED: Prediction of Visual Relations in Engineering Drawings | - | 0
WebFormer: The Web-page Transformer for Structure Information Extraction | - | 0
"What is the value of templates?" Rethinking Document Information Extraction Datasets for LLMs | - | 0
What Makes a Good Dataset for Symbol Description Reading? | - | 0
WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts | - | 0
Workshop on Document Intelligence Understanding | - | 0
XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding | - | 0
Deep Learning based Visually Rich Document Content Understanding: A Survey | - | 0
DrishtiKon: Multi-Granular Visual Grounding for Text-Rich Document Images | Code | 0
Table Detection for Visually Rich Document Images | Code | 0
Class-Agnostic Region-of-Interest Matching in Document Images | Code | 0
LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding | Code | 0
Learned Compression for Compressed Learning | Code | 0
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding | Code | 0
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding | Code | 0
DocMIA: Document-Level Membership Inference Attacks against DocVQA Models | Code | 0
Understood in Translation, Transformers for Domain Understanding | Code | 0
LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding | Code | 0
Knowing Where and What: Unified Word Block Pretraining for Document Understanding | Code | 0
KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding | Code | 0
Is ChatGPT A Good Keyphrase Generator? A Preliminary Study | Code | 0
Information Redundancy and Biases in Public Document Information Extraction Benchmarks | Code | 0
Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models | Code | 0
Long-Range Transformer Architectures for Document Understanding | Code | 0

No leaderboard results yet.