SeCoDa: Sense Complexity Dataset

2020-05-01 · LREC 2020

David Strohmaier, Sian Gooding, Shiva Taslimipoor, Ekaterina Kochmar

Abstract

The Sense Complexity Dataset (SeCoDa) provides a corpus that is annotated jointly for complexity and word senses. It is thus a valuable resource for both word sense disambiguation and complex word identification. The dataset is intended to support identifying complexity at the level of word senses rather than word tokens. For word sense annotation, SeCoDa uses a hierarchical scheme based on information available in the Cambridge Advanced Learner's Dictionary. This allows us to offer more coarse-grained senses than those directly available in WordNet.

Tasks

Complex Word Identification, Word Sense Disambiguation
