Dependency Tree Annotation with Mechanical Turk

2019-11-01 · WS 2019

Stephen Tratz


Abstract

Crowdsourcing is frequently employed to quickly and inexpensively obtain valuable linguistic annotations but is rarely used for parsing, likely due to the perceived difficulty of the task and the limited training of the available workers. This paper presents what is, to the best of our knowledge, the first published use of Mechanical Turk (or similar platform) to crowdsource parse trees. We pay Turkers to construct unlabeled dependency trees for 500 English sentences using an interactive graphical dependency tree editor, collecting 10 annotations per sentence. Despite not requiring any training, several of the more prolific workers meet or exceed 90% attachment agreement with the Penn Treebank (PTB) portion of our data, and, furthermore, for 72% of these PTB sentences, at least one Turker produces a perfect parse. Thus, we find that, supported with a simple graphical interface, people with presumably no prior experience can achieve surprisingly high degrees of accuracy on this task. To facilitate research into aggregation techniques for complex crowdsourced annotations, we publicly release our annotated corpus.
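
The two figures quoted in the abstract, attachment agreement and the share of sentences with at least one perfect parse, can be illustrated with a minimal sketch. It assumes each unlabeled tree is stored as a list of head indices (one per token, with 0 marking the root); this representation, the function names, and the toy sentence are hypothetical, and the paper's exact scoring conventions (for example, its treatment of punctuation) are not stated in the abstract.

```python
from typing import Sequence


def attachment_agreement(worker_heads: Sequence[int], gold_heads: Sequence[int]) -> float:
    """Fraction of tokens whose head matches the reference tree (unlabeled)."""
    if len(worker_heads) != len(gold_heads):
        raise ValueError("both trees must cover the same tokens")
    if not gold_heads:
        raise ValueError("sentence must contain at least one token")
    matches = sum(w == g for w, g in zip(worker_heads, gold_heads))
    return matches / len(gold_heads)


def has_perfect_parse(annotations: Sequence[Sequence[int]], gold_heads: Sequence[int]) -> bool:
    """True if at least one worker's tree equals the reference tree exactly."""
    return any(list(a) == list(gold_heads) for a in annotations)


# Hypothetical three-token sentence "She reads books"; heads are 1-based, 0 = root.
gold = [2, 0, 2]
workers = [
    [2, 0, 2],  # identical to the reference
    [2, 0, 1],  # one wrong attachment
    [0, 1, 2],  # two wrong attachments
]

scores = [attachment_agreement(w, gold) for w in workers]
any_perfect = has_perfect_parse(workers, gold)
```

Averaging `attachment_agreement` over a worker's sentences gives a per-worker figure, and averaging `has_perfect_parse` over sentences gives a corpus-level perfect-parse rate, under the assumptions above.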

Tasks

Sentence
