Native Language Identification with User Generated Content
2018-10-01 · EMNLP 2018
Gili Goldin, Ella Rabinovich, Shuly Wintner
Abstract
We address the task of native language identification in the context of social media content, where authors are highly fluent, advanced nonnative speakers of English. Using both linguistically motivated features and the characteristics of the social media outlet, we obtain high accuracy on this challenging task. We provide a detailed analysis of the features that sheds light on differences between native and nonnative speakers, and among nonnative speakers with different backgrounds.