
Scalable Knowledge Graph Construction from Text Collections

2019-11-01 · WS 2019

Ryan Clancy, Ihab F. Ilyas, Jimmy Lin

Abstract

We present a scalable, open-source platform that "distills" a potentially large text collection into a knowledge graph. Our platform takes documents stored in Apache Solr and scales out the Stanford CoreNLP toolkit via Apache Spark integration to extract mentions and relations that are then ingested into the Neo4j graph database. The raw knowledge graph is then enriched with facts extracted from an external knowledge graph. The complete product can be manipulated by various applications using Neo4j's native Cypher query language: We present a subgraph-matching approach to align extracted relations with external facts and show that fact verification, locating textual support for asserted facts, detecting inconsistent and missing facts, and extracting distantly supervised training data can all be performed within the same framework.
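
The alignment idea in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation, which runs subgraph matching as Cypher queries inside Neo4j; it only shows, in plain Python, how extracted relations might be compared with external facts. The function name `align_facts`, the three category names, the `born_in` relation, and the toy triples are all assumptions made for illustration, and how these categories map onto the paper's notions of "inconsistent" and "missing" facts is likewise an assumption.

```python
from collections import defaultdict


def align_facts(extracted, external):
    """Compare text-extracted triples with external knowledge-graph triples.

    Both arguments are iterables of (subject, relation, object) tuples.
    Returns a dict of three lists of external triples (hypothetical categories):
      supported    - the same triple was also extracted from text
      inconsistent - text gives a different object for the same subject/relation
      unmatched    - text gives no object for that subject/relation
    """
    # Index the extracted objects by (subject, relation) for fast lookup.
    objects_by_key = defaultdict(set)
    for subj, rel, obj in extracted:
        objects_by_key[(subj, rel)].add(obj)

    result = {"supported": [], "inconsistent": [], "unmatched": []}
    for subj, rel, obj in external:
        found = objects_by_key.get((subj, rel))
        if not found:
            result["unmatched"].append((subj, rel, obj))
        elif obj in found:
            result["supported"].append((subj, rel, obj))
        else:
            result["inconsistent"].append((subj, rel, obj))
    return result


# Toy data, invented for this sketch (not taken from the paper).
extracted = [
    ("person_a", "born_in", "city_x"),
    ("person_b", "born_in", "city_y"),
]
external = [
    ("person_a", "born_in", "city_x"),
    ("person_b", "born_in", "city_z"),
    ("person_c", "born_in", "city_x"),
]
result = align_facts(extracted, external)
```

In this sketch, a "supported" triple would correspond to a verified fact with textual support, while the other two lists flag candidates for closer inspection. A graph database makes the same comparison expressible as a pattern match over nodes and edges rather than a dictionary lookup.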
