document understanding

Document understanding involves document classification, layout analysis, information extraction, and DocQA.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 101–125 of 309 papers

Title	Date	Tasks	Status	Hype
Token-level Correlation-guided Compression for Efficient Multimodal Document Understanding	Jul 19, 2024	document understandingInformativeness	CodeCode Available	0
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding	Jul 17, 2024	document understandingOptical Character Recognition (OCR)	CodeCode Available	1
NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition	Jul 16, 2024	Decoderdocument understanding	—Unverified	0
DANIEL: A fast Document Attention Network for Information Extraction and Labelling of handwritten documents	Jul 12, 2024	Document Layout Analysisdocument understanding	CodeCode Available	1
Hypergraph based Understanding for Document Semantic Entity Recognition	Jul 9, 2024	document understanding	CodeCode Available	0
A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding	Jul 2, 2024	document understandingKey Information Extraction	CodeCode Available	2
MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations	Jul 1, 2024	Benchmarkingdocument understanding	CodeCode Available	2
ColPali: Efficient Document Retrieval with Vision Language Models	Jun 27, 2024	document understandingRAG	CodeCode Available	7
DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming	Jun 27, 2024	document understanding	—Unverified	0
DrVideo: Document Retrieval Based Long Video Understanding	Jun 18, 2024	document understandingEgoSchema	—Unverified	0
Multimodal Structured Generation: CVPR's 2nd MMFM Challenge Technical Report	Jun 17, 2024	document understanding	CodeCode Available	0
Enhancing Question Answering on Charts Through Effective Pre-training Tasks	Jun 14, 2024	document understandingOptical Character Recognition (OCR)	—Unverified	0
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications	Jun 12, 2024	document-image-classificationDocument Image Classification	—Unverified	0
Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning	Jun 4, 2024	document understandingGPU	CodeCode Available	1
Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use	May 30, 2024	document understandingKey Information Extraction	—Unverified	0
Notes on Applicability of GPT-4 to Document Understanding	May 28, 2024	document understandingOptical Character Recognition (OCR)	—Unverified	0
Focus Anywhere for Fine-grained Multi-page Document Understanding	May 23, 2024	document understandingOptical Character Recognition (OCR)	CodeCode Available	5
Multimodal Adaptive Inference for Document Image Classification with Anytime Early Exiting	May 21, 2024	document-image-classificationDocument Image Classification	CodeCode Available	0
GeoContrastNet: Contrastive Key-Value Edge Learning for Language-Agnostic Document Understanding	May 6, 2024	Contrastive Learningdocument understanding	CodeCode Available	0
CREPE: Coordinate-Aware End-to-End Document Parser	May 1, 2024	document understandingOptical Character Recognition (OCR)	—Unverified	0
Multi-Page Document Visual Question Answering using Self-Attention Scoring Mechanism	Apr 29, 2024	document understandingGPU	CodeCode Available	0
Machine Unlearning for Document Classification	Apr 29, 2024	ClassificationDocument Classification	CodeCode Available	0
A LayoutLMv3-Based Model for Enhanced Relation Extraction in Visually-Rich Documents	Apr 16, 2024	document understandingKey Information Extraction	—Unverified	0
Towards Efficient Resume Understanding: A Multi-Granularity Multi-Modal Pre-Training Approach	Apr 13, 2024	document understanding	—Unverified	0
HRVDA: High-Resolution Visual Document Assistant	Apr 10, 2024	document understanding	—Unverified	0

Show:10 25 50

← PrevPage 5 of 13Next →

No leaderboard results yet.