What’s in a Name? Answer Equivalence For Open-Domain Question Answering

2021-11-01, EMNLP 2021

Chenglei Si, Chen Zhao, Jordan Boyd-Graber

Abstract

A flaw in QA evaluation is that annotations often provide only one gold answer. Thus, model predictions that are semantically equivalent to the answer but superficially different are considered incorrect. This work explores mining alias entities from knowledge bases and using them as additional gold answers (i.e., equivalent answers). We incorporate these answers in two settings: evaluation with additional answers and model training with equivalent answers. We analyse three QA benchmarks: Natural Questions, TriviaQA, and SQuAD. Answer expansion increases the exact match score on all datasets for evaluation, while incorporating equivalent answers helps model training over real-world datasets. We ensure the additional answers are valid through a human post hoc evaluation.
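
The evaluation setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the alias table below is a hand-written, hypothetical stand-in for aliases mined from a knowledge base, and the `normalize`, `expand_answers`, and `exact_match` helpers are assumed names. The normalization follows the common SQuAD-style convention (lowercasing, removing punctuation and articles), which is assumed here rather than taken from the paper.

```python
import re
import string


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def expand_answers(gold_answers, alias_table):
    """Return the gold answers plus any aliases listed for them.

    alias_table maps a normalized entity name to a list of alias strings.
    It is a hypothetical stand-in for a knowledge-base lookup.
    """
    expanded = list(gold_answers)
    for answer in gold_answers:
        for alias in alias_table.get(normalize(answer), []):
            if alias not in expanded:
                expanded.append(alias)
    return expanded


def exact_match(prediction, answers):
    """True if the normalized prediction equals any normalized answer."""
    pred = normalize(prediction)
    return any(pred == normalize(answer) for answer in answers)


# Hypothetical alias table; a real one would be mined from a knowledge base.
ALIAS_TABLE = {"john f kennedy": ["JFK", "John Fitzgerald Kennedy"]}

gold = ["John F. Kennedy"]
prediction = "JFK"

em_original = exact_match(prediction, gold)
em_expanded = exact_match(prediction, expand_answers(gold, ALIAS_TABLE))
print(em_original, em_expanded)
```

With a single gold answer the prediction "JFK" is scored as wrong, while the expanded answer set accepts it. This is the kind of superficial mismatch that answer expansion is meant to remove.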