IndeGx: A Model and a Framework for Indexing RDF Knowledge Graphs with SPARQL-based Test Suits

2023-01-23 · Journal of Web Semantics 2023 · Code Available

Pierre Maillot, Olivier Corby, Catherine Faron, Fabien Gandon, Franck Michel

Abstract

In recent years, a large number of RDF datasets have been built and published on the Web in fields as diverse as linguistics and life sciences, as well as general datasets such as DBpedia or Wikidata. The joint exploitation of these datasets requires specific knowledge about their content, access points, and commonalities. However, not all datasets contain a self-description, and not all access points can handle the complex queries used to generate such a description. In this article, we provide a standards-based approach to generate the description of a dataset. The generated descriptions, as well as the process of their computation, are expressed using standard vocabularies and languages. We implemented our approach in a framework, called IndeGx, where each indexing feature and its computation are collaboratively and declaratively defined in a GitHub repository. We tested IndeGx on a set of 339 RDF datasets with endpoints listed in public catalogs, over 8 months. The results show that we can collect important characteristics of the datasets to the extent that their availability and capacities allow. The resulting index captures the commonalities, variety, and disparity in the offered content and services, and it provides important support to any application designed to query RDF datasets.
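To make the idea of a generated, standards-based description concrete, here is a minimal sketch in Python. It is not the IndeGx implementation: it assumes VoID as one possible standard vocabulary (the abstract does not name the vocabularies used), and the function names, dataset IRI, and endpoint URL are hypothetical. It parses a response in the SPARQL 1.1 Query Results JSON format for a triple-count query and emits a small Turtle description, omitting the statistic when the endpoint could not answer.

```python
import json

# A standard SPARQL 1.1 aggregate query. Endpoints with limited
# capacities may time out on it, hence the optional count below.
COUNT_TRIPLES_QUERY = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"


def extract_count(results_json, var="triples"):
    """Read an integer binding from a SPARQL 1.1 JSON results document.

    Returns None when the endpoint produced no binding for `var`.
    """
    doc = json.loads(results_json)
    bindings = doc.get("results", {}).get("bindings", [])
    if not bindings or var not in bindings[0]:
        return None
    return int(bindings[0][var]["value"])


def describe_dataset(dataset_iri, endpoint_url, triple_count=None):
    """Build a Turtle description using VoID terms (an assumed vocabulary).

    The triple count is included only if it could be computed.
    """
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "",
        f"<{dataset_iri}> a void:Dataset ;",
        f"    void:sparqlEndpoint <{endpoint_url}>",
    ]
    if triple_count is not None:
        lines[-1] += " ;"
        lines.append(f"    void:triples {triple_count}")
    lines[-1] += " ."
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical response, as an endpoint might return for COUNT_TRIPLES_QUERY.
    sample = (
        '{"head": {"vars": ["triples"]}, "results": {"bindings": '
        '[{"triples": {"type": "literal", "value": "42"}}]}}'
    )
    count = extract_count(sample)
    print(describe_dataset("http://example.org/dataset",
                           "http://example.org/sparql", count))
```

Leaving out `void:triples` when no count is available mirrors the abstract's point that what can be collected depends on each endpoint's availability and capacities.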
