emrKBQA: A Clinical Knowledge-Base Question Answering Dataset

2021-06-01 · NAACL (BioNLP) 2021 · Code Available

Preethi Raghavan, Jennifer J Liang, Diwakar Mahajan, Rachita Chandra, Peter Szolovits

Code

github.com/emrqa/emrkbqa
Official · In paper · ★ 3

Abstract

We present emrKBQA, a dataset for answering physician questions from a structured patient record. It consists of questions, logical forms, and answers. The questions and logical forms are generated from real-world physician questions, then slot-filled and answered using patient data in the MIMIC-III KB through a semi-automated process. This community-shared release contains over 940,000 question, logical form, and answer triplets, covering 389 question types with about 7.5 paraphrases per question type. We perform experiments to validate the quality of the dataset and set benchmarks for question-to-logical-form learning, which helps answer questions on this dataset.
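To make the triplet structure concrete, here is a minimal sketch of how one question type with several paraphrases might be slot-filled and answered from a structured record. Everything in it is an assumption made for illustration: the `|lab|` slot syntax, the logical form notation, the `TOY_KB` contents, and the names `Triplet` and `fill_and_answer` are hypothetical and are not taken from the emrKBQA release or from MIMIC-III.

```python
from dataclasses import dataclass

# Hypothetical toy record store standing in for a structured patient KB.
# The real dataset is built from MIMIC-III; none of these keys or values come from it.
TOY_KB = {
    ("patient_1", "lab", "creatinine"): "1.1 mg/dL",
    ("patient_1", "lab", "glucose"): "98 mg/dL",
}


@dataclass(frozen=True)
class Triplet:
    question: str
    logical_form: str
    answer: str


def fill_and_answer(question_template, logical_form_template, patient_id, lab):
    """Fill the |lab| slot in both templates, then look up the answer in the toy KB."""
    question = question_template.replace("|lab|", lab)
    logical_form = logical_form_template.replace("|lab|", lab)
    answer = TOY_KB[(patient_id, "lab", lab)]
    return Triplet(question, logical_form, answer)


# Two paraphrases of one hypothetical question type share one logical form template.
PARAPHRASES = [
    "What was the patient's last |lab| value?",
    "What is the most recent |lab| result?",
]
LF_TEMPLATE = "LabEvent(|lab|) [latest(value)]"  # assumed notation, for illustration only

triplets = [
    fill_and_answer(q, LF_TEMPLATE, "patient_1", "creatinine") for q in PARAPHRASES
]
for t in triplets:
    print(t)
```

In this sketch the paraphrases differ only in surface wording, so they map to the same filled logical form and the same answer, which is the property a question-to-logical-form model would need to learn.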

Tasks

Clinical Knowledge, Form, Knowledge Base Question Answering, Question Answering
