SOTAVerified

Document Image Classification

Document image classification is the task of classifying documents based on images of their contents.

(Image credit: Real-Time Document Image Classification using Deep CNN and Extreme Learning Machines)

Papers

Showing 1-25 of 50 papers

Title | Status | Hype
OCR-free Document Understanding Transformer | Code | 3
LayoutLM: Pre-training of Text and Layout for Document Image Understanding | Code | 2
BEiT: BERT Pre-Training of Image Transformers | Code | 2
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding | Code | 2
Revisiting ResNets: Improved Training and Scaling Strategies | Code | 1
DocXClassifier: High Performance Explainable Deep Network for Document Image Classification | Code | 1
Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer | Code | 1
Improving accuracy and speeding up Document Image Classification through parallel systems | Code | 1
Training data-efficient image transformers & distillation through attention | Code | 1
Multimodal Side-Tuning for Document Classification | Code | 1
DiT: Self-supervised Pre-training for Document Image Transformer | Code | 1
DocFormer: End-to-End Transformer for Document Understanding | Code | 1
RoBERTa: A Robustly Optimized BERT Pretraining Approach | Code | 1
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding | Code | 1
LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding |  | 0
LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding |  | 0
Real-Time Document Image Classification using Deep CNN and Extreme Learning Machines |  | 0
Analysis of Convolutional Neural Networks for Document Image Classification |  | 0
CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification |  | 0
Context-Aware Classification of Legal Document Pages |  | 0
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications |  | 0
Document AI: Benchmarks, Models and Applications |  | 0
Document image classification, with a specific view on applications of patent images |  | 0
DocXplain: A Novel Model-Agnostic Explainability Method for Document Image Classification |  | 0
Domain Agnostic Few-Shot Learning For Document Intelligence |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | EAML | Accuracy | 97.7 |  | Unverified
2 | Cross-Modal | Accuracy | 97.05 |  | Unverified
3 | DocFormerBASE | Accuracy | 96.17 |  | Unverified
4 | LayoutLMV3Large | Accuracy | 95.93 |  | Unverified
5 | LiLT[EN-R]BASE | Accuracy | 95.68 |  | Unverified
6 | LayoutLMv2LARGE | Accuracy | 95.64 |  | Unverified
7 | TILT-Large | Accuracy | 95.52 |  | Unverified
8 | DocFormer large | Accuracy | 95.5 |  | Unverified
9 | LayoutLMv3BASE | Accuracy | 95.44 |  | Unverified
10 | Donut | Accuracy | 95.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DocXClassifier-L | Accuracy | 95.57 |  | Unverified
2 | DocBert [DOCBERT] | Accuracy | 91.95 |  | Unverified
3 | Eff-GNN + Word2Vec [word2vec] | Accuracy | 91 |  | Unverified
4 | Multimodal Side-Tuning (MobileNetV2) | Accuracy | 90.5 |  | Unverified
5 | Multimodal Side-Tuning (ResNet50) | Accuracy | 90.3 |  | Unverified
6 | DocBERT [DOCBERT] | Accuracy | 82.3 |  | Unverified
7 | BERT [BERT] | Accuracy | 79 |  | Unverified
8 | Eff-GNN + Word2Vec [word2vec] + Image Embedding | Accuracy | 77.5 |  | Unverified
9 | Eff-GNN + Word2Vec [word2vec] | Accuracy | 73.5 |  | Unverified
10 | VGG | Memory | 7.08 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PCGAN-CHAR | Accuracy | 89.54 |  | Unverified
2 | Pixel-level RC | Accuracy | 77.22 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PCGAN-CHAR | Accuracy | 96.68 |  | Unverified
2 | Pixel-level RC | Accuracy | 95.46 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ResNet-RS (ResNet-200 + RS training tricks) | Top 1 Accuracy - Verb | 83.4 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Pixel-level RC | Accuracy | 97.62 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PCGAN-CHAR | Accuracy | 98.43 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CNN | Accuracy | 86 |  | Unverified
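
Most of the tables above report Accuracy, which is conventionally the percentage of test documents whose predicted class matches the ground-truth label. The following is a minimal sketch of that computation, not the evaluation code of any listed paper. The function name `accuracy_percent` and the sample class labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def accuracy_percent(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Return the share of correct predictions as a percentage (0 to 100).

    Hypothetical helper for illustration; it assumes one label per document
    and that both sequences are in the same document order.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty label list")
    # Count documents whose predicted class equals the ground-truth class.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * correct / len(y_true)


# Illustrative labels only; these are not results from any benchmark above.
true_labels = ["letter", "invoice", "memo", "invoice"]
predicted_labels = ["letter", "invoice", "invoice", "invoice"]
score = accuracy_percent(true_labels, predicted_labels)
```

Here three of the four illustrative documents are classified correctly, so `score` is 75.0. Entries that use a different metric, such as Memory or Top 1 Accuracy - Verb, are not covered by this sketch.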