
Adaptive Unsupervised Self-training for Disfluency Detection

2022-10-01 · COLING 2022 · Code Available

Zhongyuan Wang, YiXuan Wang, Shaolei Wang, Wanxiang Che



Abstract

Supervised methods have achieved remarkable results in disfluency detection. However, in real-world scenarios, human-annotated data is difficult to obtain. Recent works handle disfluency detection with unsupervised self-training, which can efficiently exploit existing large-scale unlabeled data. However, these self-training methods suffer from selection bias and error accumulation. To tackle these problems, we propose an adaptive unsupervised self-training method for disfluency detection. Specifically, we re-weight the importance of each training example according to its grammatical feature and prediction confidence. Experiments on the Switchboard dataset show that our method improves by 2.3 points over the current SOTA unsupervised method. Moreover, our method is competitive with the SOTA supervised method.
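The core idea described in the abstract, re-weighting each pseudo-labeled example by its prediction confidence and a grammaticality score during self-training, can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the names `example_weight`, `self_train_step`, the confidence threshold, and the multiplicative weighting scheme are all assumptions for illustration.

```python
# Illustrative sketch of confidence- and grammar-weighted self-training.
# The weighting formula and threshold below are assumptions, not the
# paper's exact method.

def example_weight(confidence, grammar_score, conf_threshold=0.9):
    """Down-weight pseudo-labeled examples that are low-confidence or
    grammatically implausible, mitigating selection bias and error
    accumulation."""
    if confidence < conf_threshold:
        return 0.0  # drop unreliable pseudo-labels entirely
    return confidence * grammar_score  # soft weight in [0, 1]

def self_train_step(model_predict, unlabeled, grammar_score):
    """One self-training round: pseudo-label the unlabeled pool and
    attach an importance weight to each retained example."""
    weighted = []
    for sent in unlabeled:
        label, conf = model_predict(sent)  # pseudo-label + confidence
        w = example_weight(conf, grammar_score(sent))
        if w > 0:
            weighted.append((sent, label, w))
    return weighted

# Example with stand-in scoring functions (hypothetical):
batch = self_train_step(
    lambda s: ("DISFLUENT", 0.95) if "uh" in s else ("FLUENT", 0.5),
    ["uh I mean the cat", "the cat sat"],
    lambda s: 0.8,
)
# Only the high-confidence example survives, carrying weight 0.95 * 0.8.
```

The retained weights would then scale each example's loss in the next training round, so unreliable pseudo-labels contribute little or nothing to the updated model.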
