Solving the AL Chicken-and-Egg Corpus and Model Problem: Model-free Active Learning for Phenomena-driven Corpus Construction

2016-05-01 · LREC 2016

Dain Kaplan, Neil Rubens, Simone Teufel, Takenobu Tokunaga

Abstract

Active learning (AL) is often used in corpus construction (CC) for selecting "informative" documents for annotation. This is ideal for focusing annotation efforts when not all documents can be annotated, but it has the limitation that it is carried out in a closed loop, selecting points that will improve an existing model. For phenomena-driven and exploratory CC, the lack of an existing model and of specific task(s) for using it makes traditional AL inapplicable. In this paper we propose a novel method for model-free AL that utilises characteristics of phenomena to select documents for annotation. The method can also supplement traditional closed-loop AL-based CC to extend the utility of the corpus created beyond a single task. We introduce our tool, MOVE, and show its potential with a real-world case study.
