Multi-Class Grammatical Error Detection for Correction: A Tale of Two Systems

2021-11-01 · EMNLP 2021

Zheng Yuan, Shiva Taslimipoor, Christopher Davis, Christopher Bryant

Code

github.com/gotutiyan/ged_baselines
pytorch

Abstract

In this paper, we show how a multi-class grammatical error detection (GED) system can be used to improve grammatical error correction (GEC) for English. Specifically, we first develop a new state-of-the-art binary detection system based on pre-trained ELECTRA, and then extend it to multi-class detection using different error type tagsets derived from the ERRANT framework. Output from this detection system is used as auxiliary input to fine-tune a novel encoder-decoder GEC model, and we subsequently re-rank the N-best GEC output to find the hypothesis that most agrees with the GED output. Results show that fine-tuning the GEC system using 4-class GED produces the best model, but re-ranking using 55-class GED leads to the best performance overall. This suggests that different multi-class GED systems benefit GEC in different ways. Ultimately, our system outperforms all other previous work that combines GED and GEC, and achieves a new single-model NMT-based state of the art on the BEA-test benchmark.
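The re-ranking step described in the abstract (choosing the N-best hypothesis that most agrees with the GED output) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary token-level GED flags on the source sentence rather than the 4-class or 55-class tagsets, uses Python's `difflib` alignment as a stand-in for ERRANT-style alignment, and the function names `edited_mask`, `agreement`, and `rerank` are hypothetical.

```python
from difflib import SequenceMatcher


def edited_mask(source, hypothesis):
    """Return one bool per source token: True if the hypothesis edits it.

    Assumption: a plain difflib token alignment stands in for the
    ERRANT-style alignment that a real system would likely use.
    """
    mask = [False] * len(source)
    matcher = SequenceMatcher(a=source, b=hypothesis, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            for i in range(i1, i2):
                mask[i] = True
        elif tag == "insert" and source:
            # Attribute an insertion to the source token that follows it
            # (or to the last token when inserting at the very end).
            mask[min(i1, len(source) - 1)] = True
    return mask


def agreement(ged_flags, mask):
    """Fraction of source tokens where the GED flag matches the edit mask."""
    if not ged_flags:
        return 1.0
    matches = sum(g == m for g, m in zip(ged_flags, mask))
    return matches / len(ged_flags)


def rerank(source, ged_flags, nbest):
    """Pick the hypothesis whose edits agree most with the GED flags.

    Ties are broken in favour of the hypothesis ranked higher by the
    GEC model (lower index in `nbest`).
    """
    scores = [agreement(ged_flags, edited_mask(source, hyp)) for hyp in nbest]
    best = max(range(len(nbest)), key=lambda k: (scores[k], -k))
    return nbest[best]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    source = "She go to school yesterday".split()
    ged_flags = [False, True, False, False, False]  # hypothetical GED output
    nbest = [
        "She go to school yesterday".split(),
        "She went to school yesterday".split(),
        "She went to the school yesterday".split(),
    ]
    print(" ".join(rerank(source, ged_flags, nbest)))
```

In this toy example the second hypothesis edits exactly the token that the detector flagged, so it is preferred over the unchanged top hypothesis and over the one that makes an extra, unflagged edit. A multi-class variant would compare predicted error types per token instead of a single edited/unedited flag.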

Tasks

Decoder, Grammatical Error Correction, Grammatical Error Detection, NMT, Re-Ranking
