Document Layout Analysis

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...], or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
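
The text-line definition quoted above can be illustrated with a small sketch. It assumes characters or connected components have already been extracted as axis-aligned bounding boxes, and it uses a simple greedy left-to-right grouping rather than the nearest-neighbor clustering of the cited paper. The names `group_into_lines`, `vertical_overlap`, `max_gap`, and `min_overlap` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

# A box is (x0, y0, x1, y1) in pixels, with y growing downward (assumption).
Box = Tuple[float, float, float, float]


def vertical_overlap(a: Box, b: Box) -> float:
    """Fraction of the shorter box's height that a and b share vertically."""
    shared = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(0.0, shared) / shorter if shorter > 0 else 0.0


def group_into_lines(
    boxes: List[Box], max_gap: float = 10.0, min_overlap: float = 0.5
) -> List[List[Box]]:
    """Greedily group boxes into horizontal text lines.

    A box joins an existing line when it is "relatively close" to the
    line's last box: the horizontal gap is at most max_gap and the two
    boxes overlap vertically by at least min_overlap. Both thresholds
    are illustrative defaults, not values from the cited paper.
    """
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[0]):  # scan left to right
        for line in lines:
            last = line[-1]
            close = box[0] - last[2] <= max_gap
            if close and vertical_overlap(last, box) >= min_overlap:
                line.append(box)
                break
        else:
            # No existing line accepted the box, so it starts a new one.
            lines.append([box])
    return lines
```

With real page images the thresholds would normally be derived from the data, for example from typical character height and spacing, and vertical text would need the roles of the x and y axes swapped.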

Image credit: PubLayNet: largest dataset ever for document layout analysis

Papers


Showing 1–50 of 99 papers

Title	Date	Tasks	Status	Hype
PP-DocLayout: A Unified Document Layout Detection Model to Accelerate Large-Scale Data Construction	Mar 21, 2025	CPU, Document Layout Analysis	Code Available	9
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception	Oct 16, 2024	Document Layout Analysis, document understanding	Code Available	9
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis	Jun 2, 2022	Document Layout Analysis, Object Detection	Code Available	8
A Large Dataset of Historical Japanese Documents with Complex Layouts	Apr 18, 2020	Document Layout Analysis	Code Available	3
UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis	Mar 20, 2025	Document Layout Analysis, Document Summarization	Code Available	2
Towards End-to-End Unified Scene Text Detection and Layout Analysis	Mar 28, 2022	Document Layout Analysis, Scene Text Detection	Code Available	2
Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis	Jan 22, 2024	Document Layout Analysis, Document Summarization	Code Available	2
BEiT: BERT Pre-Training of Image Transformers	Jun 15, 2021	Document Image Classification, Document Layout Analysis	Code Available	2
PubLayNet: largest dataset ever for document layout analysis	Aug 16, 2019	Articles, Document Layout Analysis	Code Available	2
LayoutLM: Pre-training of Text and Layout for Document Image Understanding	Dec 31, 2019	Document AI, document-image-classification	Code Available	2
Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis	Aug 29, 2023	Document AI, Document Layout Analysis	Code Available	1
RoDLA: Benchmarking the Robustness of Document Layout Analysis Models	Mar 21, 2024	Benchmarking, Document Layout Analysis	Code Available	1
Training data-efficient image transformers & distillation through attention	Dec 23, 2020	Document Image Classification, Document Layout Analysis	Code Available	1
SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation	May 1, 2023	Document Layout Analysis, object-detection	Code Available	1
DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis	Jul 6, 2021	Document Layout Analysis, Image Generation	Code Available	1
DiT: Self-supervised Pre-training for Document Image Transformer	Mar 4, 2022	Document AI, document-image-classification	Code Available	1
PARAGRAPH2GRAPH: A GNN-based framework for layout paragraph analysis	Apr 24, 2023	Document Layout Analysis, Graph Neural Network	Code Available	1
M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis	Jan 1, 2023	Articles, Document Layout Analysis	Code Available	1
Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks	Aug 23, 2022	Document Layout Analysis, document understanding	Code Available	1
DocBank: A Benchmark Dataset for Document Layout Analysis	Jun 1, 2020	Document Layout Analysis	Code Available	1
CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images	Aug 25, 2020	Document Layout Analysis, Table Detection	Code Available	1
docExtractor: An off-the-shelf historical document element extraction	Dec 15, 2020	Document Layout Analysis, Segmentation	Code Available	1
DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer	Jan 27, 2022	Decision Making, Document Layout Analysis	Code Available	1
Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers	Feb 14, 2020	Document Layout Analysis, Semantic Segmentation	Code Available	1
appjsonify: An Academic Paper PDF-to-JSON Conversion Toolkit	Oct 2, 2023	Document Layout Analysis	Code Available	1
CTE: A Dataset for Contextualized Table Extraction	Feb 2, 2023	Document Layout Analysis, Table Detection	Code Available	1
DANIEL: A fast Document Attention Network for Information Extraction and Labelling of handwritten documents	Jul 12, 2024	Document Layout Analysis, document understanding	Code Available	1
Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis	Aug 22, 2022	Component Classification, Document Layout Analysis	Code Available	1
Evaluation of a Region Proposal Architecture for Multi-task Document Layout Analysis	Jun 22, 2021	Document Layout Analysis, Keyword Spotting	Unverified	0
Extracting Complex Named Entities in Legal Documents via Weakly Supervised Object Detection	May 10, 2023	Document Layout Analysis, Information Retrieval	Unverified	0
Framework and Model Analysis on Bengali Document Layout Analysis Dataset: BaDLAD	Aug 15, 2023	Document Layout Analysis	Unverified	0
From Codicology to Code: A Comparative Study of Transformer and YOLO-based Detectors for Layout Analysis in Historical Documents	Jun 25, 2025	Document Layout Analysis, object-detection	Unverified	0
Graph-based Document Structure Analysis	Feb 4, 2025	Document Layout Analysis, Relation	Unverified	0
Human-In-The-Loop Document Layout Analysis	Aug 4, 2021	Document Layout Analysis, Semantic Segmentation	Unverified	0
ICDAR 2024 Competition on Few-Shot and Many-Shot Layout Segmentation of Ancient Manuscripts (SAM)	Sep 11, 2024	Diversity, Document Layout Analysis	Unverified	0
Improving Document Clustering by Removing Unnatural Language	Sep 1, 2017	Clustering, Document Layout Analysis	Unverified	0
VTLayout: Fusion of Visual and Text Features for Document Layout Analysis	Aug 12, 2021	Document Layout Analysis	Unverified	0
A document processing pipeline for the construction of a dataset for topic modeling based on the judgments of the Italian Supreme Court	May 13, 2025	Diversity, Document Layout Analysis	Unverified	0
A Hybrid Approach for Document Layout Analysis in Document images	Apr 27, 2024	Contrastive Learning, Decoder	Unverified	0
AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization	Mar 28, 2025	Document Layout Analysis, object-detection	Unverified	0
AutoIE: An Automated Framework for Information Extraction from Scientific Literature	Jan 30, 2024	Document Layout Analysis, Management	Unverified	0
Benchmarking Graph Neural Networks for Document Layout Analysis in Public Affairs	May 12, 2025	Benchmarking, Document Layout Analysis	Unverified	0
Bengali Document Layout Analysis -- A YOLOV8 Based Ensembling Approach	Sep 2, 2023	Data Augmentation, Document Layout Analysis	Unverified	0
Bengali Document Layout Analysis with Detectron2	Aug 26, 2023	Data Augmentation, Document Layout Analysis	Unverified	0
Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images	Jun 23, 2023	Document Layout Analysis, Object	Unverified	0
BROS: A Pre-trained Language Model for Understanding Texts in Document	Jan 1, 2021	Decoder, Diversity	Unverified	0
Callico: a Versatile Open-Source Document Image Annotation Platform	May 2, 2024	Document Layout Analysis, HTR	Unverified	0
Cross-Domain Document Layout Analysis Using Document Style Guide	Jan 24, 2022	Contrastive Learning, Document Layout Analysis	Unverified	0
Détection d'Objets dans les documents numérisés par réseaux de neurones profonds	Jan 27, 2023	Document Layout Analysis, Line Detection	Unverified	0
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications	Jun 12, 2024	document-image-classification, Document Image Classification	Unverified	0


Datasets: PubLayNet val, U-DIADS-Bib, D4LA, Document Layout Recognition Challenge mini-dev, Document Layout Recognition Challenge test, RVL-CDIP

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CDeC-Net	Table	0.98	—	Unverified
2	VGT	Overall	0.96	—	Unverified
3	TRDLU	Overall	0.96	—	Unverified
4	VSR	Overall	0.96	—	Unverified
5	DETR	Overall	0.96	—	Unverified
6	LayoutLMv3-B	Overall	0.95	—	Unverified
7	DiT-L	Overall	0.95	—	Unverified
8	DoPTA	Overall	0.95	—	Unverified
9	UDoc	Overall	0.94	—	Unverified
10	ResNext-101-32×8d	Overall	0.94	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CV-Group	Class Average IoU	83.4	—	Unverified
2	CNKI	Class Average IoU	77.8	—	Unverified
3	VAI-OCR	Class Average IoU	70.7	—	Unverified
4	DeepLabV3+	Class Average IoU	66.5	—	Unverified
5	L3i++	Class Average IoU (Few-shot setting)	61.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DoPTA	mAP	70.72	—	Unverified
2	DocLayout-YOLO	mAP	70.3	—	Unverified
3	VGT	mAP	68.8	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Faster_RCNN	Overall	0.96	—	Unverified
2	fglihai	Overall	0.96	—	Unverified
3	Faster-RCNN	Overall	0.95	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	fglihai	Overall	0.92	—	Unverified
2	USYD NLP_CS29-2	Overall	0.92	—	Unverified
3	Faster-RCNN	Overall	0.91	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	VisualWordGrid	FAR	28.7	—	Unverified