SB@GU at the Complex Word Identification 2018 Shared Task

2018-06-01 · WS 2018

David Alfter, Ildikó Pilán


Abstract

In this paper, we describe our experiments for the Shared Task on Complex Word Identification (CWI) 2018 (Yimam et al., 2018), hosted by the 13th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) at NAACL 2018. Our system for English builds on previous work for Swedish concerning the classification of words into proficiency levels. We investigate different features for English and compare their usefulness using feature selection methods. For the German, Spanish and French data, we use simple systems based on character n-gram models and show that simple models sometimes achieve results comparable to those of fully feature-engineered systems.
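To make the idea of a character n-gram model concrete, the following is a minimal sketch and not the authors' system: the abstract does not specify the n-gram order, the smoothing, or how scores are turned into labels. The names `char_ngrams`, `CharNgramModel`, and `avg_log_prob`, the trigram order, and add-one smoothing are all assumptions chosen for illustration. The underlying intuition is that words built from character sequences common in simple vocabulary may receive a higher average log-probability than unusual words.

```python
import math
from collections import Counter


def char_ngrams(word, n):
    """Return the character n-grams of a word, padded with boundary markers."""
    padded = "^" * (n - 1) + word.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class CharNgramModel:
    """Add-one smoothed character n-gram model (hypothetical, for illustration)."""

    def __init__(self, n=3):
        self.n = n
        self.ngram_counts = Counter()    # counts of full n-grams
        self.context_counts = Counter()  # counts of (n-1)-character contexts
        self.alphabet = set()

    def fit(self, words):
        for word in words:
            for gram in char_ngrams(word, self.n):
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1
                self.alphabet.update(gram)
        return self

    def avg_log_prob(self, word):
        """Average log-probability per n-gram; lower values suggest rarer patterns."""
        grams = char_ngrams(word, self.n)
        vocab = len(self.alphabet) + 1  # one extra slot for unseen characters
        total = 0.0
        for gram in grams:
            numerator = self.ngram_counts[gram] + 1
            denominator = self.context_counts[gram[:-1]] + vocab
            total += math.log(numerator / denominator)
        return total / len(grams)


# Toy training list standing in for a corpus of simple words (assumed data).
simple_words = ["the", "cat", "that", "hat", "then", "chat"]
model = CharNgramModel(n=3).fit(simple_words)
familiar_score = model.avg_log_prob("that")
unfamiliar_score = model.avg_log_prob("xylyl")
```

In a setup like this, the score could serve as a single feature or be thresholded on development data; which of these, if either, matches the submitted systems is not stated in the abstract.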

Tasks

Complex Word Identification, Feature Selection, General Classification, Language Modelling, Text Simplification, Word Embeddings
