WIQA: A dataset for "What if..." reasoning over procedural text

2019-11-01, IJCNLP 2019

Niket Tandon, Bhavana Dalvi, Keisuke Sakaguchi, Peter Clark, Antoine Bosselut

Abstract

We introduce WIQA, the first large-scale dataset of "What if..." questions over procedural text. WIQA contains a collection of paragraphs, each annotated with multiple influence graphs describing how one change affects another, and a large (40k) collection of "What if...?" multiple-choice questions derived from these. For example, given a paragraph about beach erosion, would stormy weather hasten or decelerate erosion? WIQA contains three kinds of questions: perturbations to steps mentioned in the paragraph; external (out-of-paragraph) perturbations requiring commonsense knowledge; and irrelevant (no effect) perturbations. We find that state-of-the-art models achieve 73.8% accuracy, well below the human performance of 96.3%. We analyze the challenges, in particular tracking chains of influences, and present the dataset as an open challenge to the community.

Tasks

Multiple-choice
