Document Intelligence Metrics for Visually Rich Document Evaluation

2022-05-23

Jonathan Degange, Swapnil Gupta, Zhuoyu Han, Krzysztof Wilkosz, Adam Karwan

Abstract

The processing of Visually-Rich Documents (VRDs) is highly important in information extraction tasks associated with Document Intelligence. We introduce DI-Metrics, a Python library for VRD model evaluation that comprises text-based, geometry-based, and hierarchical metrics for information extraction tasks. We apply DI-Metrics to evaluate information extraction performance on the publicly available CORD dataset, comparing the performance of three state-of-the-art (SOTA) models and one industry model. The open-source library is available on GitHub.
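To make the metric families named in the abstract more concrete, the following is a minimal sketch of one text-based measure (a normalized edit-distance similarity) and one geometry-based measure (intersection over union of bounding boxes). It does not use the DI-Metrics API; the function names `levenshtein`, `text_similarity`, and `box_iou`, and the `(x1, y1, x2, y2)` box convention, are assumptions chosen for illustration only.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def text_similarity(predicted: str, expected: str) -> float:
    """Hypothetical text-based score: 1.0 is an exact match, 0.0 is fully different."""
    longest = max(len(predicted), len(expected))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(predicted, expected) / longest


def box_iou(box_a: Box, box_b: Box) -> float:
    """Hypothetical geometry-based score: intersection over union of two boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0
```

A field-level evaluation could, for instance, pair each predicted field with its ground-truth counterpart and report `text_similarity` on the extracted strings alongside `box_iou` on their locations. How DI-Metrics itself combines or aggregates such scores is not stated here.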
