Developing a Dataset of Overridden Information in Wikipedia

2022-06-01 · LREC 2022

Masatoshi Tsuchiya, Yasutaka Yokoi

Abstract

This paper proposes a new task: detecting information override. Because not all information on the Web is updated in a timely manner, information that has been overridden by another information source needs to be identified and discarded. The task is formalized as a binary classification problem: determine whether a reference sentence has overridden a target sentence. To investigate this task, the paper describes a procedure for constructing a dataset of overridden information by collecting sentence pairs from the differences between two versions of Wikipedia. The dataset under development shows that the old version of Wikipedia contains a large amount of overridden information and that detecting information override is therefore necessary.
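
The pair-collection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the naive regex sentence splitter, the use of Python's `difflib` for alignment, and the names `split_sentences` and `candidate_pairs` are all assumptions made for illustration. It takes two versions of the same article text and returns (target, reference) candidates, where the target sentence comes from the old version and the reference sentence from the new one. Deciding whether each candidate is a true override (the binary label) is a separate step that is not shown.

```python
import difflib
import re


def split_sentences(text):
    """Naive sentence splitter (an assumption; a real pipeline would use a proper tokenizer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_pairs(old_text, new_text):
    """Return (target, reference) sentence pairs that differ between two versions.

    target    -- sentence from the old version (possibly overridden)
    reference -- sentence from the new version that replaced it
    """
    old = split_sentences(old_text)
    new = split_sentences(new_text)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)

    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only "replace" blocks give an old sentence and a new sentence
        # occupying the same position; pure insertions and deletions are skipped.
        if tag == "replace":
            pairs.extend(zip(old[i1:i2], new[j1:j2]))
    return pairs


if __name__ == "__main__":
    old_version = "A is B. The count is 100. C is D."
    new_version = "A is B. The count is 120. C is D."
    for target, reference in candidate_pairs(old_version, new_version):
        print("target:   ", target)
        print("reference:", reference)
```

Each returned pair is only a candidate. A rewrite that merely rephrases a sentence would also show up here, which is why a classification step over the pairs is still needed.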
