OneStopEnglish corpus: A new corpus for automatic readability assessment and text simplification

2018-06-01 · WS 2018

Sowmya Vajjala, Ivana Lučić

Abstract

This paper describes the collection and compilation of the OneStopEnglish corpus of texts written at three reading levels, and demonstrates its usefulness through two applications: automatic readability assessment and automatic text simplification. The corpus consists of 189 texts, each in three versions (567 in total). The corpus is now freely available under a CC BY-SA 4.0 license, and we hope that it will foster further research on readability assessment and text simplification.
