DetIE: Multilingual Open Information Extraction Inspired by Object Detection

2022-06-24

Michael Vasilkovsky, Anton Alekseev, Valentin Malykh, Ilya Shenbin, Elena Tutubalina, Dmitriy Salikhov, Mikhail Stepnov, Andrey Chertok, Sergey Nikolenko

Code

github.com/sberbank-ai/DetIE
Official · In paper · ★ 20

Abstract

State-of-the-art neural methods for open information extraction (OpenIE) usually extract triplets (or tuples) iteratively, in an autoregressive or predicate-based manner, in order not to produce duplicates. In this work, we propose a different approach to the problem that can be equally or more successful. Namely, we present a novel single-pass method for OpenIE inspired by object detection algorithms from computer vision. We use an order-agnostic loss based on bipartite matching that forces unique predictions and a Transformer-based encoder-only architecture for sequence labeling. The proposed approach is faster than state-of-the-art models and shows superior or similar quality on standard benchmarks. Our model sets a new state of the art of 67.7% F1 on CaRB evaluated as OIE2016 while being 3.35x faster at inference than the previous state of the art. We also evaluate the multilingual version of our model in the zero-shot setting for two languages and introduce a strategy for generating synthetic multilingual data to fine-tune the model for each specific language. In this setting, we show a performance improvement of 15% on multilingual Re-OIE2016, reaching 75% F1 for both Portuguese and Spanish. Code and models are available at https://github.com/sberbank-ai/DetIE.
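
To make the idea of an order-agnostic loss based on bipartite matching more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the token-level IoU score, the label names (`ARG1`, `REL`, `ARG2`, `O`), and the functions `iou`, `best_matching`, and `order_agnostic_loss` are all assumptions made for illustration, and the optimal assignment is found by exhaustive search rather than by an efficient assignment algorithm.

```python
from itertools import permutations


def iou(pred, gold):
    """Token-level IoU between two label sequences (hypothetical score).

    A token is in the intersection when both sequences give it the same
    non-'O' label; it is in the union when either label is non-'O'.
    """
    inter = sum(1 for p, g in zip(pred, gold) if p == g and p != "O")
    union = sum(1 for p, g in zip(pred, gold) if p != "O" or g != "O")
    return inter / union if union else 0.0


def best_matching(preds, golds):
    """Find the one-to-one assignment of predicted slots to gold triplets
    that maximizes total IoU. Exhaustive search, so only suitable for a
    handful of slots; requires len(preds) >= len(golds).

    Returns (assignment, total) where assignment[j] is the index of the
    predicted slot matched to gold triplet j.
    """
    best_perm, best_score = (), -1.0
    for perm in permutations(range(len(preds)), len(golds)):
        score = sum(iou(preds[i], golds[j]) for j, i in enumerate(perm))
        if score > best_score:
            best_perm, best_score = perm, score
    return list(best_perm), best_score


def order_agnostic_loss(preds, golds):
    """1 - mean IoU under the best matching; the order of preds is irrelevant."""
    if not golds:
        return 0.0, []
    assignment, total = best_matching(preds, golds)
    return 1.0 - total / len(golds), assignment


# Two gold triplets over a 4-token sentence, and three predicted slots
# that contain the same triplets in a different order plus an empty slot.
golds = [["ARG1", "REL", "ARG2", "O"], ["ARG1", "O", "REL", "ARG2"]]
preds = [["ARG1", "O", "REL", "ARG2"], ["O", "O", "O", "O"], ["ARG1", "REL", "ARG2", "O"]]
loss, assignment = order_agnostic_loss(preds, golds)
```

Because each gold triplet is matched to a distinct predicted slot, two slots that emit the same triplet cannot both be rewarded for it, which is the intuition behind "forces unique predictions". For realistic numbers of slots, the exhaustive search would normally be replaced by a polynomial-time assignment solver.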

Tasks

Multilingual NLP, Object Detection, Open Information Extraction

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
LSOIE	DetIE_LSOIE	F1	71.4	—	Unverified
LSOIE	CIGL-OIE	F1	59.7	—	Unverified
LSOIE	DetIE_LSOIE + IGL-CA	F1	58.7	—	Unverified
LSOIE	DetIE_IMoJIE	F1	55.7	—	Unverified
LSOIE	OpenIE4	F1	54.6	—	Unverified
LSOIE	OpenIE6 (CIGL-OIE + IGL-CA)	F1	51.6	—	Unverified
LSOIE	OpenIE5	F1	49.5	—	Unverified
LSOIE	DetIE_IMoJIE (ours) + IGL-CA	F1	45.9	—	Unverified
LSOIE	OLLIE (Mausam et al., 2012)	F1	36.8	—	Unverified
