SOTAVerified

Key Information Extraction

Key Information Extraction (KIE) aims to extract structured information (e.g. key-value pairs) from form-style documents such as invoices, an important step toward intelligent document understanding.
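To make the task's input/output shape concrete, here is a minimal, illustrative sketch: given OCR tokens with bounding boxes from a form-style document, produce structured key-value pairs. The models listed below learn this mapping; the naive rule here (pair each "Key:" token with the token to its right on the same line) is a hypothetical stand-in, not any listed method.

```python
def extract_key_values(tokens):
    """tokens: list of (text, (x0, y0, x1, y1)) in reading order."""
    pairs = {}
    for i, (text, box) in enumerate(tokens):
        # Treat a token ending in ":" as a field label.
        if text.endswith(":") and i + 1 < len(tokens):
            value_text, value_box = tokens[i + 1]
            # Rough y-alignment check: label and value on the same line.
            if abs(box[1] - value_box[1]) < 5:
                pairs[text.rstrip(":")] = value_text
    return pairs

invoice_tokens = [
    ("Invoice No:", (10, 10, 80, 22)), ("INV-0042", (90, 10, 150, 22)),
    ("Total:",      (10, 40, 50, 52)), ("$1,234.56", (60, 40, 130, 52)),
]
print(extract_key_values(invoice_tokens))
# {'Invoice No': 'INV-0042', 'Total': '$1,234.56'}
```

Real KIE systems replace the alignment heuristic with learned layout-aware representations, which is what distinguishes the model families in the list below.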

Papers

Showing 1–50 of 74 papers

| Title | Status | Hype |
| --- | --- | --- |
| TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document | Code | 5 |
| A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding | Code | 2 |
| LayoutLM: Pre-training of Text and Layout for Document Image Understanding | Code | 2 |
| LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding | Code | 2 |
| OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models | Code | 2 |
| PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks | Code | 1 |
| Key Information Extraction From Documents: Evaluation And Generator | Code | 1 |
| Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks | Code | 1 |
| GenKIE: Robust Generative Multimodal Document Key Information Extraction | Code | 1 |
| ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding | Code | 1 |
| KVP10k: A Comprehensive Dataset for Key-Value Pair Extraction in Business Documents | Code | 1 |
| LAMBERT: Layout-Aware (Language) Modeling for information extraction | Code | 1 |
| BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents | Code | 1 |
| Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction | Code | 1 |
| Form-NLU: Dataset for the Form Natural Language Understanding | Code | 1 |
| Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding | Code | 1 |
| DocILE Benchmark for Document Information Localization and Extraction | Code | 1 |
| Exploring OCR Capabilities of GPT-4V(ision): A Quantitative and In-depth Evaluation | Code | 1 |
| PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction | Code | 1 |
| LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding | Code | 0 |
| AMuRD: Annotated Arabic-English Receipt Dataset for Key Information Extraction and Classification | Code | 0 |
| LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | Code | 0 |
| MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding | Code | 0 |
| Multimodal weighted graph representation for information extraction from visually rich documents | Code | 0 |
| Different Tastes of Entities: Investigating Human Label Variation in Named Entity Annotations | Code | 0 |
| OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition | Code | 0 |
| OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models | Code | 0 |
| PaddleOCR 3.0 Technical Report | Code | 0 |
| PP-StructureV2: A Stronger Document Analysis System | Code | 0 |
| GeoLayoutLM: Geometric Pre-training for Visual Information Extraction | Code | 0 |
| GraphRevisedIE: Multimodal Information Extraction with Graph-Revised Network | Code | 0 |
| ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction | Code | 0 |
| Information Extraction from Visually Rich Documents Using Directed Weighted Graph Neural Network | Code | 0 |
| Spatial Dual-Modality Graph Reasoning for Key Information Extraction | Code | 0 |
| Information Redundancy and Biases in Public Document Information Extraction Benchmarks | Code | 0 |
| Class-Agnostic Region-of-Interest Matching in Document Images | Code | 0 |
| DoSA: A System to Accelerate Annotations on Business Documents with Human-in-the-Loop | Code | 0 |
| Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations | Code | 0 |
| XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | Code | 0 |
| See then Tell: Enhancing Key Information Extraction with Vision Grounding | | 0 |
| A LayoutLMv3-Based Model for Enhanced Relation Extraction in Visually-Rich Documents | | 0 |
| CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy | | 0 |
| Construction of a Syntactic Analysis Map for Yi Shui School through Text Mining and Natural Language Processing Research | | 0 |
| Data Efficient Training of a U-Net Based Architecture for Structured Documents Localization | | 0 |
| Deep Learning based Key Information Extraction from Business Documents: Systematic Literature Review | | 0 |
| DONUT-hole: DONUT Sparsification by Harnessing Knowledge and Optimizing Learning Efficiency | | 0 |
| DUBLIN -- Document Understanding By Language-Image Network | | 0 |
| Emergency Communication: OTFS-Based Semantic Transmission with Diffusion Noise Suppression | | 0 |
| End-to-End Document Classification and Key Information Extraction using Assignment Optimization | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | RORE (GeoLayoutLM) | F1 | 98.52 | | Unverified |
| 2 | GeoLayoutLM | F1 | 97.97 | | Unverified |
| 3 | LayoutLMv3 (large) | F1 | 97.46 | | Unverified |
| 4 | LayoutMask (large) | F1 | 97.19 | | Unverified |
| 5 | LayoutMask (base) | F1 | 96.99 | | Unverified |
| 6 | TPP (LayoutMask) | F1 | 96.92 | | Unverified |
| 7 | LiLT | F1 | 96.07 | | Unverified |
| 8 | LayoutLMv2 (large) | F1 | 96.01 | | Unverified |
| 9 | LayoutLMv2 (base) | F1 | 94.95 | | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LayoutLMv2 (large, excluding OCR mismatch) | F1 | 97.81 | | Unverified |
| 2 | RORE (GeoLayoutLM) | F1 | 96.97 | | Unverified |
| 3 | LayoutLMv2 (large) | F1 | 96.61 | | Unverified |
| 4 | LayoutLMv2 (base) | F1 | 96.25 | | Unverified |
| 5 | ChatGPT 3.5 (SpatialFormat) | Accuracy | 77 | | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LayoutLMv2 (large) | F1 | 85.2 | | Unverified |
| 2 | LayoutLMv2 (base) | F1 | 83.3 | | Unverified |
| 3 | LAMBERT (75M) | F1 | 80.42 | | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DAN | F1 (%) | 95.05 | | Unverified |
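The F1 scores in the tables above are typically computed at the entity level: a prediction counts only if both the field type and the extracted value match the gold annotation exactly. A minimal sketch of micro-averaged entity-level F1 follows; the exact matching rules vary per benchmark (e.g. the "excluding OCR mismatch" variant above relaxes value matching), so this is illustrative, not any benchmark's official scorer.

```python
def entity_f1(predicted, gold):
    """predicted, gold: sets of (field, value) tuples for a document."""
    tp = len(predicted & gold)  # exact (field, value) matches
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = {("company", "ACME Ltd"), ("total", "1,234.56"), ("date", "2021-03-01")}
pred = {("company", "ACME Ltd"), ("total", "1,234.56"), ("date", "2021-03-02")}
print(round(entity_f1(pred, gold), 4))  # 2 of 3 exact matches -> 0.6667
```

Because one character of OCR error fails the exact match, OCR quality directly bounds these scores, which is why some leaderboard entries report OCR-mismatch-excluded variants.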