SOTAVerified

Key Information Extraction

Key Information Extraction (KIE) aims to extract structured information (e.g., key-value pairs) from form-style documents (e.g., invoices), an important step toward intelligent document understanding.

Papers

Showing 1–25 of 74 papers

| Title | Status | Hype |
| --- | --- | --- |
| PaddleOCR 3.0 Technical Report | Code | 0 |
| Class-Agnostic Region-of-Interest Matching in Document Images | Code | 0 |
| Hallucinations and Key Information Extraction in Medical Texts: A Comprehensive Assessment of Open-Source Large Language Models | | 0 |
| Emergency Communication: OTFS-Based Semantic Transmission with Diffusion Noise Suppression | | 0 |
| KIEval: Evaluation Metric for Document Key Information Extraction | | 0 |
| OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models | Code | 0 |
| CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy | | 0 |
| "What is the value of templates?" Rethinking Document Information Extraction Datasets for LLMs | | 0 |
| GraphRevisedIE: Multimodal Information Extraction with Graph-Revised Network | Code | 0 |
| Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding | Code | 1 |
| See then Tell: Enhancing Key Information Extraction with Vision Grounding | | 0 |
| ViBERTgrid BiLSTM-CRF: Multimodal Key Information Extraction from Unstructured Financial Documents | | 0 |
| Information Extraction from Visually Rich Documents Using Directed Weighted Graph Neural Network | Code | 0 |
| Deep Learning based Key Information Extraction from Business Documents: Systematic Literature Review | | 0 |
| A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding | Code | 2 |
| Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use | | 0 |
| XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser | Code | 0 |
| KVP10k : A Comprehensive Dataset for Key-Value Pair Extraction in Business Documents | Code | 1 |
| A LayoutLMv3-Based Model for Enhanced Relation Extraction in Visually-Rich Documents | | 0 |
| RealKIE: Five Novel Datasets for Enterprise Key Information Extraction | | 0 |
| OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition | Code | 0 |
| TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document | Code | 5 |
| Construction of a Syntactic Analysis Map for Yi Shui School through Text Mining and Natural Language Processing Research | | 0 |
| LAPDoc: Layout-Aware Prompting for Documents | | 0 |
| Different Tastes of Entities: Investigating Human Label Variation in Named Entity Annotations | Code | 0 |

Benchmark Results

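| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | RORE (GeoLayoutLM) | F1 | 98.52 | | Unverified |
| 2 | GeoLayoutLM | F1 | 97.97 | | Unverified |
| 3 | LayoutLMv3 Large | F1 | 97.46 | | Unverified |
| 4 | LayoutMask (large) | F1 | 97.19 | | Unverified |
| 5 | LayoutMask (base) | F1 | 96.99 | | Unverified |
| 6 | TPP (LayoutMask) | F1 | 96.92 | | Unverified |
| 7 | LILT | F1 | 96.07 | | Unverified |
| 8 | LayoutLMv2LARGE | F1 | 96.01 | | Unverified |
| 9 | LayoutLMv2BASE | F1 | 94.95 | | Unverified |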
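
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LayoutLMv2LARGE (Excluding OCR mismatch) | F1 | 97.81 | | Unverified |
| 2 | RORE (GeoLayoutLM) | F1 | 96.97 | | Unverified |
| 3 | LayoutLMv2LARGE | F1 | 96.61 | | Unverified |
| 4 | LayoutLMv2BASE | F1 | 96.25 | | Unverified |
| 5 | ChatGPT 3.5 SpatialFormat | Accuracy | 77 | | Unverified |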
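
| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LayoutLMv2LARGE | F1 | 85.2 | | Unverified |
| 2 | LayoutLMv2BASE | F1 | 83.3 | | Unverified |
| 3 | LAMBERT (75M) | F1 | 80.42 | | Unverified |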

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DAN | F1 (%) | 95.05 | | Unverified |