SOTAVerified

Document Layout Analysis

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
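
The notion of a text line in the quote above (adjacent, "relatively close" elements through which a straight line can be drawn) can be illustrated with a small sketch. This is not O'Gorman's docstrum method, only a simple greedy heuristic for horizontal lines. The box format `(x0, y0, x1, y1)`, the function names, and the thresholds `max_gap_ratio` and `min_overlap` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1) with the origin at the top left.
Box = Tuple[float, float, float, float]


def vertical_overlap(a: Box, b: Box) -> float:
    """Fraction of the shorter box's height that the two boxes share."""
    shared = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(0.0, shared) / shorter if shorter > 0 else 0.0


def group_into_lines(
    boxes: List[Box],
    max_gap_ratio: float = 1.5,  # assumed: allowed gap, as a multiple of box height
    min_overlap: float = 0.5,    # assumed: required vertical overlap fraction
) -> List[List[Box]]:
    """Greedily chain boxes from left to right into horizontal text lines."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: (b[0], b[1])):
        for line in lines:
            last = line[-1]
            gap = box[0] - last[2]       # horizontal distance to the line's last box
            height = last[3] - last[1]
            close = gap <= max_gap_ratio * height
            if close and vertical_overlap(last, box) >= min_overlap:
                line.append(box)
                break
        else:
            # No existing line is close enough, so start a new one.
            lines.append([box])
    return lines
```

Here "relatively close" is approximated by a horizontal gap threshold scaled by box height, and "a straight line can be drawn" by requiring vertical overlap between neighbouring boxes. Skewed or vertical text would need a different criterion.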

Image credit: PubLayNet: largest dataset ever for document layout analysis

Papers

Showing 51–99 of 99 papers

Title | Status | Hype
DLAFormer: An End-to-End Transformer For Document Layout Analysis | - | 0
DocBed: A Multi-Stage OCR Solution for Documents with Complex Layouts | - | 0
DocSAM: Unified Document Image Segmentation via Query Decomposition and Heterogeneous Mixed Learning | - | 0
Document AI: Benchmarks, Models and Applications | - | 0
Document Domain Randomization for Deep Learning Document Layout Extraction | - | 0
Document Image Layout Analysis via Explicit Edge Embedding Network | - | 0
Document Layout Analysis on BaDLAD Dataset: A Comprehensive MViTv2 Based Approach | - | 0
Document Layout Analysis via Dynamic Residual Feature Fusion | - | 0
Document Layout Analysis with Aesthetic-Guided Image Augmentation | - | 0
DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | - | 0
EDocNet: Efficient Datasheet Layout Analysis Based on Focus and Global Knowledge Distillation | - | 0
Efficient few-shot learning for pixel-precise handwritten document layout analysis | - | 0
Evaluation of a Region Proposal Architecture for Multi-task Document Layout Analysis | - | 0
Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks | - | 0
Natural Language Inspired Approach for Handwritten Text Line Detection in Legacy Documents | - | 0
Neural Graph Matching for Modification Similarity Applied to Electronic Document Comparison | - | 0
Object Recognition from Scientific Document based on Compartment Refinement Framework | - | 0
Parameter-free Geometric Document Layout Analysis | - | 0
Performance Enhancement Leveraging Mask-RCNN on Bengali Document Layout Analysis | - | 0
SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation | - | 0
The YOLO model that still excels in document layout analysis | - | 0
Towards Unified Multi-granularity Text Detection with Interactive Attention | - | 0
Transformer-based Approach for Document Understanding | - | 0
U-DIADS-Bib: a full and few-shot pixel-precise dataset for document layout analysis of ancient manuscripts | - | 0
Unified Pretraining Framework for Document Understanding | - | 0
UnSupDLA: Towards Unsupervised Document Layout Analysis | - | 0
Vision-Based Layout Detection from Scientific Literature using Recurrent Convolutional Neural Networks | - | 0
Visual Detection with Context for Document Layout Analysis | - | 0
VisualWordGrid: Information Extraction From Scanned Documents Using A Multimodal Approach | - | 0
Document Layout Annotation: Database and Benchmark in the Domain of Public Affairs | Code | 0
A Graphical Approach to Document Layout Analysis | Code | 0
dhSegment: A generic deep-learning approach for document segmentation | Code | 0
SFDLA: Source-Free Document Layout Analysis | Code | 0
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding | Code | 0
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | Code | 0
LayoutReader: Pre-training of Text and Layout for Reading Order Detection | Code | 0
M^6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis | Code | 0
Information Extraction from Visually Rich Documents Using Directed Weighted Graph Neural Network | Code | 0
Multimodal weighted graph representation for information extraction from visually rich documents. | Code | 0
Text Role Classification in Scientific Charts Using Multimodal Transformers | Code | 0
Multi-Task Handwritten Document Layout Analysis | Code | 0
Vision Grid Transformer for Document Layout Analysis | Code | 0
DCQA: Document-Level Chart Question Answering towards Complex Reasoning and Common-Sense Understanding | Code | 0
BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset | Code | 0
ICDAR 2021 Competition on Historical Map Segmentation | Code | 0
Class-Agnostic Region-of-Interest Matching in Document Images | Code | 0
PdfTable: A Unified Toolkit for Deep Learning-Based Table Extraction | Code | 0
VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | Code | 0
DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CDeC-Net | Table | 0.98 | - | Unverified
2 | VGT | Overall | 0.96 | - | Unverified
3 | TRDLU | Overall | 0.96 | - | Unverified
4 | DETR | Overall | 0.96 | - | Unverified
5 | VSR | Overall | 0.96 | - | Unverified
6 | LayoutLMv3-B | Overall | 0.95 | - | Unverified
7 | DiT-L | Overall | 0.95 | - | Unverified
8 | DoPTA | Overall | 0.95 | - | Unverified
9 | UDoc | Overall | 0.94 | - | Unverified
10 | ResNext-101-32×8d | Overall | 0.94 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CV-Group | Class Average IoU | 83.4 | - | Unverified
2 | CNKI | Class Average IoU | 77.8 | - | Unverified
3 | VAI-OCR | Class Average IoU | 70.7 | - | Unverified
4 | DeepLabV3+ | Class Average IoU | 66.5 | - | Unverified
5 | L3i++ | Class Average IoU (Few-shot setting) | 61.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DoPTA | mAP | 70.72 | - | Unverified
2 | DocLayout-YOLO | mAP | 70.3 | - | Unverified
3 | VGT | mAP | 68.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Faster_RCNN | Overall | 0.96 | - | Unverified
2 | fglihai | Overall | 0.96 | - | Unverified
3 | Faster-RCNN | Overall | 0.95 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | fglihai | Overall | 0.92 | - | Unverified
2 | USYD NLP_CS29-2 | Overall | 0.92 | - | Unverified
3 | Faster-RCNN | Overall | 0.91 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VisualWordGrid | FAR | 28.7 | - | Unverified