QASports: A Question Answering Dataset about Sports

2023-09-25 · Simpósio Brasileiro de Bancos de Dados - Dataset Showcase Workshop 2023

Pedro Calciolari Jardim, Leonardo Mauro Pereira Moraes, Cristina Dutra Aguiar

Abstract

Sport is one of the most popular and revenue-generating forms of entertainment. Therefore, analyzing data related to this domain introduces several opportunities for Question Answering (QA) systems, such as supporting tactical decision-making. However, to develop and evaluate QA systems, researchers and developers need datasets that contain questions and their corresponding answers. In this paper, we address this need. We propose QASports, the first large sports question answering dataset for extractive question answering. QASports contains more than 1.5 million triples of questions, answers, and contexts about three popular sports: soccer, American football, and basketball. We describe the QASports processes for data collection and for question and answer generation. We also describe the characteristics of the QASports data. Furthermore, we analyze the sources used to obtain raw data and investigate the usability of QASports by issuing "wh-queries". Finally, we describe scenarios for using QASports, highlighting its importance for training and evaluating QA systems.
