Evaluating the Robustness of Adverse Drug Event Classification Models Using Templates
Dorothea MacPhail, David Harbecke, Lisa Raithel, Sebastian Möller
Code: github.com/dfki-nlp/ade_templates (official)
Abstract
An adverse drug effect (ADE) is any harmful event resulting from medical drug treatment. Despite their importance, ADEs are often under-reported through official channels. Some research has therefore turned to detecting discussions of ADEs on social media, and several approaches have reported strong results. In a high-stakes domain such as medicine, however, an in-depth evaluation of a model's abilities is crucial. We address the issue of thorough performance evaluation in English-language ADE detection with hand-crafted templates for four capabilities: temporal order, negation, sentiment, and beneficial effect. We find that models with similar performance on held-out test sets have varying results on these capabilities.
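To make the idea of template-based capability testing concrete, the following is a minimal sketch and not the authors' implementation. The template sentences, the slot names `{drug}` and `{ade}`, the word lists, and the keyword classifier are all hypothetical and chosen only for illustration; the actual templates are in the repository linked above. Each template carries an expected label (1 = ADE present, 0 = no ADE), and the failure rate is reported per capability.

```python
from itertools import product

# Hypothetical templates (NOT taken from the paper); each is (text, expected_label).
# Expected label 0 means the sentence should NOT be classified as an ADE report.
TEMPLATES = {
    "temporal_order": [("I had {ade} before I started taking {drug}.", 0)],
    "negation": [("I took {drug} and did not get {ade}.", 0)],
    "sentiment": [("I hate {drug}, but it gave me no side effects.", 0)],
    "beneficial_effect": [("{drug} finally got rid of my {ade}.", 0)],
}

# Hypothetical slot fillers.
DRUGS = ["ibuprofen", "metformin"]
ADES = ["headache", "nausea"]


def fill_templates(templates, drugs, ades):
    """Expand every template with every (drug, ade) pair."""
    cases = []
    for capability, items in templates.items():
        for (text, label), (drug, ade) in product(items, product(drugs, ades)):
            # str.format ignores keyword arguments a template does not use.
            cases.append((capability, text.format(drug=drug, ade=ade), label))
    return cases


def failure_rates(cases, predict):
    """Return the fraction of wrong predictions for each capability."""
    totals, failures = {}, {}
    for capability, text, label in cases:
        totals[capability] = totals.get(capability, 0) + 1
        if predict(text) != label:
            failures[capability] = failures.get(capability, 0) + 1
    return {cap: failures.get(cap, 0) / n for cap, n in totals.items()}


def keyword_classifier(text):
    """Toy stand-in for a trained model: predicts ADE if a symptom word appears."""
    return int(any(ade in text.lower() for ade in ADES))


cases = fill_templates(TEMPLATES, DRUGS, ADES)
rates = failure_rates(cases, keyword_classifier)
for capability, rate in rates.items():
    print(f"{capability}: failure rate {rate:.2f}")
```

In practice, `keyword_classifier` would be replaced by the prediction function of a trained ADE classifier. Comparing the per-capability failure rates of several models is what allows differences to surface that an aggregate held-out test score can hide.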