SOTAVerified

Document Layout Analysis

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
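The two building blocks named in the definition above, connected components and text lines, can be illustrated with a minimal sketch. It is not O'Gorman's docstrum method, which clusters nearest-neighbor distances and angles. It is a simplified stand-in that labels 8-connected foreground regions in a binary grid and then greedily merges horizontally close boxes. The function names `connected_components` and `group_into_lines` and the `max_gap` parameter are hypothetical and chosen only for illustration.

```python
from collections import deque


def connected_components(grid):
    """Return bounding boxes (top, left, bottom, right) of 8-connected regions of 1s."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def group_into_lines(boxes, max_gap=2):
    """Greedily merge boxes into horizontal text lines.

    A box joins an existing line when the two overlap vertically and the
    horizontal gap between them is at most max_gap pixels (an assumed
    threshold standing in for "relatively close").
    """
    lines = []
    for top, left, bottom, right in sorted(boxes, key=lambda b: (b[1], b[0])):
        for i, (lt, ll, lb, lr) in enumerate(lines):
            overlaps = top <= lb and bottom >= lt
            gap = left - lr - 1
            if overlaps and gap <= max_gap:
                lines[i] = (min(lt, top), ll, max(lb, bottom), max(lr, right))
                break
        else:
            lines.append((top, left, bottom, right))
    return lines


# Toy binary "page": 1 = ink, 0 = background.
page = [
    [1, 1, 0, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0, 0],
]
components = connected_components(page)
text_lines = group_into_lines(components)
```

On this toy page the sketch finds five components and merges them into three line boxes. The isolated component at the right edge stays separate because its gap exceeds `max_gap`. Real systems would typically derive that threshold from the page's own spacing statistics instead of fixing it.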

Image credit: PubLayNet: largest dataset ever for document layout analysis

Papers

Showing 1–50 of 99 papers

| Title | Status | Hype |
|---|---|---|
| PP-DocLayout: A Unified Document Layout Detection Model to Accelerate Large-Scale Data Construction | Code | 9 |
| DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception | Code | 9 |
| DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis | Code | 8 |
| A Large Dataset of Historical Japanese Documents with Complex Layouts | Code | 3 |
| UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis | Code | 2 |
| Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis | Code | 2 |
| Towards End-to-End Unified Scene Text Detection and Layout Analysis | Code | 2 |
| BEiT: BERT Pre-Training of Image Transformers | Code | 2 |
| LayoutLM: Pre-training of Text and Layout for Document Image Understanding | Code | 2 |
| PubLayNet: largest dataset ever for document layout analysis | Code | 2 |
| DANIEL: A fast Document Attention Network for Information Extraction and Labelling of handwritten documents | Code | 1 |
| RoDLA: Benchmarking the Robustness of Document Layout Analysis Models | Code | 1 |
| appjsonify: An Academic Paper PDF-to-JSON Conversion Toolkit | Code | 1 |
| Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis | Code | 1 |
| SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation | Code | 1 |
| PARAGRAPH2GRAPH: A GNN-based framework for layout paragraph analysis | Code | 1 |
| CTE: A Dataset for Contextualized Table Extraction | Code | 1 |
| M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis | Code | 1 |
| Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks | Code | 1 |
| Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis | Code | 1 |
| DiT: Self-supervised Pre-training for Document Image Transformer | Code | 1 |
| DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer | Code | 1 |
| DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis | Code | 1 |
| Training data-efficient image transformers & distillation through attention | Code | 1 |
| docExtractor: An off-the-shelf historical document element extraction | Code | 1 |
| CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images | Code | 1 |
| DocBank: A Benchmark Dataset for Document Layout Analysis | Code | 1 |
| Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers | Code | 1 |
| Class-Agnostic Region-of-Interest Matching in Document Images | Code | 0 |
| From Codicology to Code: A Comparative Study of Transformer and YOLO-based Detectors for Layout Analysis in Historical Documents | | 0 |
| SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation | | 0 |
| A document processing pipeline for the construction of a dataset for topic modeling based on the judgments of the Italian Supreme Court | | 0 |
| Benchmarking Graph Neural Networks for Document Layout Analysis in Public Affairs | | 0 |
| AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization | | 0 |
| SFDLA: Source-Free Document Layout Analysis | Code | 0 |
| EDocNet: Efficient Datasheet Layout Analysis Based on Focus and Global Knowledge Distillation | | 0 |
| Graph-based Document Structure Analysis | | 0 |
| DocSAM: Unified Document Image Segmentation via Query Decomposition and Heterogeneous Mixed Learning | | 0 |
| DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | | 0 |
| Information Extraction from Visually Rich Documents Using Directed Weighted Graph Neural Network | Code | 0 |
| ICDAR 2024 Competition on Few-Shot and Many-Shot Layout Segmentation of Ancient Manuscripts (SAM) | | 0 |
| PdfTable: A Unified Toolkit for Deep Learning-Based Table Extraction | Code | 0 |
| DistilDoc: Knowledge Distillation for Visually-Rich Document Applications | | 0 |
| UnSupDLA: Towards Unsupervised Document Layout Analysis | | 0 |
| Towards Unified Multi-granularity Text Detection with Interactive Attention | | 0 |
| DLAFormer: An End-to-End Transformer For Document Layout Analysis | | 0 |
| Callico: a Versatile Open-Source Document Image Annotation Platform | | 0 |
| A Hybrid Approach for Document Layout Analysis in Document images | | 0 |
| Text Role Classification in Scientific Charts Using Multimodal Transformers | Code | 0 |
| AutoIE: An Automated Framework for Information Extraction from Scientific Literature | | 0 |

Benchmark Results

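| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CDeC-Net | Table | 0.98 | | Unverified |
| 2 | VGT | Overall | 0.96 | | Unverified |
| 3 | TRDLU | Overall | 0.96 | | Unverified |
| 4 | VSR | Overall | 0.96 | | Unverified |
| 5 | DETR | Overall | 0.96 | | Unverified |
| 6 | LayoutLMv3-B | Overall | 0.95 | | Unverified |
| 7 | DiT-L | Overall | 0.95 | | Unverified |
| 8 | DoPTA | Overall | 0.95 | | Unverified |
| 9 | UDoc | Overall | 0.94 | | Unverified |
| 10 | ResNext-101-32×8d | Overall | 0.94 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CV-Group | Class Average IoU | 83.4 | | Unverified |
| 2 | CNKI | Class Average IoU | 77.8 | | Unverified |
| 3 | VAI-OCR | Class Average IoU | 70.7 | | Unverified |
| 4 | DeepLabV3+ | Class Average IoU | 66.5 | | Unverified |
| 5 | L3i++ | Class Average IoU (Few-shot setting) | 61.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DoPTA | mAP | 70.72 | | Unverified |
| 2 | DocLayout-YOLO | mAP | 70.3 | | Unverified |
| 3 | VGT | mAP | 68.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Faster_RCNN | Overall | 0.96 | | Unverified |
| 2 | fglihai | Overall | 0.96 | | Unverified |
| 3 | Faster-RCNN | Overall | 0.95 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | fglihai | Overall | 0.92 | | Unverified |
| 2 | USYD NLP_CS29-2 | Overall | 0.92 | | Unverified |
| 3 | Faster-RCNN | Overall | 0.91 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VisualWordGrid | FAR | 28.7 | | Unverified |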