Divide-and-Conquer Text Simplification by Scalable Data Enhancement

2021-11-16 · ACL ARR November 2021

Anonymous

Abstract

Text simplification, which aims to reduce reading difficulty, can be decomposed into four discrete rewriting operations: substitution, deletion, reordering, and splitting. However, because of a large distribution discrepancy between existing training data and human-annotated data, models may learn improper operations, which leads to poor generalization. To bridge this gap, we propose a novel data enhancement method, Simsim, that generates training pairs by simulating specific simplification operations. Experiments show that models trained with Simsim outperform multiple strong baselines and achieve better SARI scores on the Turk and Asset datasets. The newly constructed dataset Simsim is available at *.
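
As a rough illustration of the four operations named in the abstract, the toy sketch below applies each one to a whitespace-tokenized sentence. It is not the Simsim pipeline: the function names, the lexicon, the droppable-word set, and the pivot and conjunction tokens are all hypothetical choices made for this example, and the abstract does not describe how Simsim simulates each operation.

```python
# Toy illustration only. Every helper and argument here is hypothetical
# and is NOT taken from the Simsim method described in the abstract.

def substitute(tokens, lexicon):
    """Substitution: swap words found in a complex-to-simple lexicon."""
    return [lexicon.get(tok, tok) for tok in tokens]


def delete(tokens, droppable):
    """Deletion: drop tokens that appear in a set of removable words."""
    return [tok for tok in tokens if tok not in droppable]


def reorder(tokens, pivot=","):
    """Reordering: move the part before the first pivot to the end."""
    if pivot not in tokens:
        return list(tokens)
    i = tokens.index(pivot)
    return tokens[i + 1:] + tokens[:i]


def split_sentence(tokens, conjunction="and"):
    """Splitting: cut one sentence into two at the first conjunction."""
    if conjunction not in tokens:
        return [list(tokens)]
    i = tokens.index(conjunction)
    return [tokens[:i], tokens[i + 1:]]


def make_pair(source, operation, **kwargs):
    """Build one (complex, simple) training pair from a source sentence."""
    result = operation(source.split(), **kwargs)
    # split_sentence returns a list of token lists; the others return one list.
    if result and isinstance(result[0], list):
        simple = " . ".join(" ".join(part) for part in result)
    else:
        simple = " ".join(result)
    return source, simple


if __name__ == "__main__":
    print(make_pair("they utilize tools", substitute, lexicon={"utilize": "use"}))
    print(make_pair("the very old house", delete, droppable={"very"}))
    print(make_pair("after the storm , the town flooded", reorder))
    print(make_pair("the dog barked and the cat ran", split_sentence))
```

Each call to `make_pair` yields one source sentence paired with its rewritten form, which is the general shape of a training pair for a single operation. A real system would need linguistically informed rules or learned models rather than these string-level heuristics.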

Tasks

Text Simplification
