Exploring Optimal Voting in Native Language Identification

2017-09-01 · WS 2017

Cyril Goutte, Serge Léger

Abstract

We describe the submissions entered by the National Research Council Canada in the NLI-2017 evaluation. We mainly explored the use of voting, and various ways to optimize the choice and number of voting systems. We also explored the use of features that rely on no linguistic preprocessing. Long n-grams of characters obtained from raw text turned out to yield the best performance on all textual input (written essays and speech transcripts). Voting ensembles turned out to produce small performance gains, with little difference between the various optimization strategies we tried. Our top systems achieved accuracies of 87% on the essay track, 84% on the speech track, and close to 92% by combining essays, speech and i-vectors in the fusion track.
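
The two ideas named in the abstract, character n-grams taken from raw text and voting across several systems, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tie-breaking rule, and the example labels are all assumptions made for illustration, and the abstract does not say which voting rule or n-gram lengths were used.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of raw text.

    No tokenization or other linguistic preprocessing is applied.
    """
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def plurality_vote(predictions):
    """Combine the labels predicted by several systems for one document.

    `predictions` must be a non-empty list of labels, one per system.
    Ties go to the earliest-listed system among the tied labels
    (an assumption, not taken from the paper).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def vote_ensemble(system_outputs):
    """Apply plurality voting document by document.

    `system_outputs` holds one list of labels per system, all of the
    same length and in the same document order.
    """
    return [plurality_vote(list(labels)) for labels in zip(*system_outputs)]


# Hypothetical outputs of three systems on three documents.
system_outputs = [
    ["FRE", "GER", "SPA"],
    ["FRE", "ITA", "SPA"],
    ["GER", "ITA", "TUR"],
]
ensemble_labels = vote_ensemble(system_outputs)
```

In this sketch every system gets one equal vote. Optimizing "the choice and number of voting systems", as the abstract puts it, would amount to selecting which lists go into `system_outputs`, for instance by comparing ensemble accuracy on held-out data.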