SOTAVerified

Document Layout Analysis

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
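The notion of a text line in the definition above (adjacent, "relatively close" components that a roughly straight line can pass through) can be illustrated with a minimal sketch. This is not the docstrum method from the cited paper; it is a simple greedy heuristic for horizontal lines only, and the names `group_into_lines`, `max_gap`, and `min_overlap` as well as the threshold values are assumptions made for illustration.

```python
from typing import List, Tuple

# A component's bounding box: (x0, y0, x1, y1), with y growing downward.
Box = Tuple[float, float, float, float]


def vertical_overlap(a: Box, b: Box) -> float:
    """Fraction of the shorter box's height that both boxes share."""
    shared = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(shared, 0.0) / shorter if shorter > 0 else 0.0


def group_into_lines(boxes: List[Box], max_gap: float = 20.0,
                     min_overlap: float = 0.5) -> List[List[Box]]:
    """Greedily group boxes into horizontal text lines (illustrative only).

    A box joins an existing line when it is "relatively close" to the
    line's last box (horizontal gap <= max_gap) and sits at roughly the
    same height (vertical overlap >= min_overlap). Both thresholds are
    hypothetical defaults, not values from the source.
    """
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[0]):  # left to right
        for line in lines:
            last = line[-1]
            gap = box[0] - last[2]
            if gap <= max_gap and vertical_overlap(last, box) >= min_overlap:
                line.append(box)
                break
        else:
            # No existing line accepted the box, so it starts a new one.
            lines.append([box])
    return lines


example = [(0, 0, 10, 10), (12, 1, 22, 11), (0, 30, 10, 40), (200, 0, 210, 10)]
lines = group_into_lines(example)
```

In this example the first two boxes form one line, the box further down the page forms a second, and the distant box on the right starts a third because its horizontal gap exceeds `max_gap`. Real systems typically handle skew, vertical text, and variable font sizes, which this sketch ignores.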

Image credit: PubLayNet: largest dataset ever for document layout analysis

Papers

Showing 76–99 of 99 papers

| Title | Status | Hype |
| --- | --- | --- |
| ICDAR 2021 Competition on Historical Map Segmentation | Code | 0 |
| Document Domain Randomization for Deep Learning Document Layout Extraction | | 0 |
| VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | Code | 0 |
| Document Layout Analysis via Dynamic Residual Feature Fusion | | 0 |
| BROS: A Pre-trained Language Model for Understanding Texts in Document | | 0 |
| LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding | Code | 0 |
| Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks | | 0 |
| Training data-efficient image transformers & distillation through attention | Code | 1 |
| docExtractor: An off-the-shelf historical document element extraction | Code | 1 |
| Vision-Based Layout Detection from Scientific Literature using Recurrent Convolutional Neural Networks | | 0 |
| VisualWordGrid: Information Extraction From Scanned Documents Using A Multimodal Approach | | 0 |
| CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images | Code | 1 |
| DocBank: A Benchmark Dataset for Document Layout Analysis | Code | 1 |
| A Large Dataset of Historical Japanese Documents with Complex Layouts | Code | 3 |
| Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers | Code | 1 |
| LayoutLM: Pre-training of Text and Layout for Document Image Understanding | Code | 2 |
| Visual Detection with Context for Document Layout Analysis | | 0 |
| PubLayNet: largest dataset ever for document layout analysis | Code | 2 |
| Multi-Task Handwritten Document Layout Analysis | Code | 0 |
| dhSegment: A generic deep-learning approach for document segmentation | Code | 0 |
| Improving Document Clustering by Removing Unnatural Language | | 0 |
| DIVA-HisDB: A Precisely Annotated Large Dataset of Challenging Medieval Manuscripts | | 0 |
| Natural Language Inspired Approach for Handwritten Text Line Detection in Legacy Documents | | 0 |
| Parameter-free Geometric Document Layout Analysis | | 0 |

Benchmark Results

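| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CDeC-Net | Table | 0.98 | | Unverified |
| 2 | VGT | Overall | 0.96 | | Unverified |
| 3 | TRDLU | Overall | 0.96 | | Unverified |
| 4 | VSR | Overall | 0.96 | | Unverified |
| 5 | DETR | Overall | 0.96 | | Unverified |
| 6 | LayoutLMv3-B | Overall | 0.95 | | Unverified |
| 7 | DiT-L | Overall | 0.95 | | Unverified |
| 8 | DoPTA | Overall | 0.95 | | Unverified |
| 9 | UDoc | Overall | 0.94 | | Unverified |
| 10 | ResNext-101-32×8d | Overall | 0.94 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CV-Group | Class Average IoU | 83.4 | | Unverified |
| 2 | CNKI | Class Average IoU | 77.8 | | Unverified |
| 3 | VAI-OCR | Class Average IoU | 70.7 | | Unverified |
| 4 | DeepLabV3+ | Class Average IoU | 66.5 | | Unverified |
| 5 | L3i++ | Class Average IoU (Few-shot setting) | 61.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DoPTA | mAP | 70.72 | | Unverified |
| 2 | DocLayout-YOLO | mAP | 70.3 | | Unverified |
| 3 | VGT | mAP | 68.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Faster_RCNN | Overall | 0.96 | | Unverified |
| 2 | fglihai | Overall | 0.96 | | Unverified |
| 3 | Faster-RCNN | Overall | 0.95 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | fglihai | Overall | 0.92 | | Unverified |
| 2 | USYD NLP_CS29-2 | Overall | 0.92 | | Unverified |
| 3 | Faster-RCNN | Overall | 0.91 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | VisualWordGrid | FAR | 28.7 | | Unverified |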