CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way
2021-08-01 · SemEval
Greta Smolenska, Peter Kolb, Sinan Tang, Mironas Bitinis, Héctor Hernández, Elin Asklöv
Abstract
This paper presents the system we submitted to the first Lexical Complexity Prediction (LCP) Shared Task 2021. The Shared Task provides participants with a new English dataset that includes the context of each target word. We participate in the single-word complexity prediction sub-task and focus on feature engineering. Our best system is trained on linguistic features and word embeddings (Pearson's correlation of 0.7942). We demonstrate, however, that a simpler feature set achieves comparable results, and we submit a model trained on 36 linguistic features (Pearson's correlation of 0.7925).
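The scores quoted above are Pearson correlations between predicted and gold complexity values. As a minimal sketch of how such a score can be computed, the function below implements the standard Pearson formula in plain Python. The function name `pearson_r` and the sample values are illustrative assumptions; they are not taken from the paper or from the Shared Task's evaluation script.

```python
import math


def pearson_r(gold, predicted):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(gold) != len(predicted) or len(gold) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(gold)
    mean_gold = sum(gold) / n
    mean_pred = sum(predicted) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((g - mean_gold) * (p - mean_pred) for g, p in zip(gold, predicted))
    # Sums of squared deviations (unnormalised variances).
    var_gold = sum((g - mean_gold) ** 2 for g in gold)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_gold == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_gold * var_pred)


if __name__ == "__main__":
    # Made-up complexity scores in [0, 1], for illustration only.
    gold_scores = [0.10, 0.25, 0.40, 0.55, 0.80]
    model_scores = [0.15, 0.20, 0.45, 0.50, 0.70]
    print(f"Pearson r = {pearson_r(gold_scores, model_scores):.4f}")
```

A library routine such as `scipy.stats.pearsonr` would normally be used instead; the hand-written version is shown only to make the metric explicit.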