A Quantitative Study of Data in the NLP community

2017-04-01 · WS 2017

Margot Mieskes

Abstract

We present the results of a quantitative analysis of how publications in the NLP domain handle the collection, publication, and availability of research data. We find that a wide range of publications rely on data crawled from the web, but few give details on how potentially sensitive data was treated. Additionally, we find that while links to data repositories are given, they often stop working even a short time after publication. We put together several suggestions for improving this situation, drawing on publications from the NLP domain as well as from other research areas.
