CoDEx: A Comprehensive Knowledge Graph Completion Benchmark

2020-09-16 · EMNLP 2020 · Code Available

Tara Safavi, Danai Koutra

Abstract

We present CoDEx, a set of knowledge graph completion datasets extracted from Wikidata and Wikipedia that improve upon existing knowledge graph completion benchmarks in scope and level of difficulty. In terms of scope, CoDEx comprises three knowledge graphs varying in size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples that are plausible but verified to be false. To characterize CoDEx, we contribute thorough empirical analyses and benchmarking experiments. First, we analyze each CoDEx dataset in terms of logical relation patterns. Next, we report baseline link prediction and triple classification results on CoDEx for five extensively tuned embedding models. Finally, we differentiate CoDEx from the popular FB15K-237 knowledge graph completion dataset by showing that CoDEx covers more diverse and interpretable content, and is a more difficult link prediction benchmark. Data, code, and pretrained models are available at https://bit.ly/2EPbrJs.

Benchmark Results

Dataset        Model     Metric   Claimed   Verified   Status
CoDEx Large    TransE    MRR      0.19      -          Unverified
CoDEx Large    ComplEx   MRR      0.29      -          Unverified
CoDEx Large    TuckER    MRR      0.31      -          Unverified
CoDEx Large    ConvE     MRR      0.3       -          Unverified
CoDEx Large    RESCAL    MRR      0.3       -          Unverified
CoDEx Medium   ConvE     MRR      0.32      -          Unverified
CoDEx Medium   ComplEx   MRR      0.34      -          Unverified
CoDEx Medium   TuckER    MRR      0.33      -          Unverified
CoDEx Medium   RESCAL    MRR      0.32      -          Unverified
CoDEx Medium   TransE    MRR      0.3       -          Unverified
CoDEx Small    TuckER    MRR      0.44      -          Unverified
CoDEx Small    TransE    MRR      0.35      -          Unverified
CoDEx Small    ComplEx   MRR      0.4       -          Unverified
CoDEx Small    RESCAL    MRR      0.4       -          Unverified
CoDEx Small    ConvE     MRR      0.44      -          Unverified
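
The metric in the results above, MRR (mean reciprocal rank), is the average of 1/rank of the correct entity over all link prediction queries, so higher is better and 1.0 is a perfect score. The sketch below only illustrates the general definition; it is not the authors' evaluation code. The function names, the tie-handling rule, and the example scores are assumptions made for illustration, and this page does not state whether the listed values use filtered ranking.

```python
from typing import Sequence


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """1-based rank of the true candidate among all scored candidates.

    Assumption: ties are broken optimistically, i.e. only candidates with
    a strictly higher score are counted as ranked ahead of the true one.
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all queries; ranks must be 1-based."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical model scores for three queries; in each query the true
# entity is the candidate at index 0.
queries = [
    [0.9, 0.2, 0.1],        # true entity ranked 1st
    [0.5, 0.7, 0.1],        # true entity ranked 2nd
    [0.3, 0.8, 0.6, 0.4],   # true entity ranked 4th
]
ranks = [rank_of_true(scores, 0) for scores in queries]
mrr = mean_reciprocal_rank(ranks)
print(ranks, round(mrr, 4))
```

Because reciprocal ranks are averaged, a model that often places the correct entity first but occasionally ranks it very low can still obtain a moderate MRR, which is one reason the metric is usually read alongside Hits@k.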