Reducing Model Churn: Stable Re-training of Conversational Agents

2022-09-01 · SIGDIAL (ACL) 2022 · Code Available

Christopher Hidey, Fei Liu, Rahul Goel


Abstract

Retraining modern deep learning systems can lead to variations in model performance, even when models are trained on the same data with the same hyper-parameters, simply by using different random seeds. This phenomenon is known as model churn or model jitter. The issue is often exacerbated in real-world settings, where noise may be introduced during data collection. In this work we tackle the problem of stable retraining, with a novel focus on structured prediction for conversational semantic parsing. We first quantify model churn by introducing metrics for agreement between predictions across multiple retrainings. Next, we devise realistic scenarios for noise injection and demonstrate the effectiveness of various churn-reduction techniques such as ensembling and distillation. Lastly, we discuss practical trade-offs between these techniques and show that co-distillation provides a sweet spot in terms of churn reduction with only a modest increase in resource usage.
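The abstract's notion of churn as disagreement between retrainings can be sketched as a simple prediction-agreement metric. The following is a minimal illustration, not the paper's actual metric; all function and variable names are hypothetical, and the example uses flat intent labels rather than full structured parses.

```python
# Hypothetical sketch of a churn metric: the fraction of test examples
# on which two retrainings of the same model disagree. The paper's own
# metrics target structured semantic parses; this simplified version
# compares flat predictions for illustration.

def agreement(preds_a, preds_b):
    """Fraction of examples on which two retrained models agree exactly."""
    assert len(preds_a) == len(preds_b), "prediction lists must align"
    matches = sum(a == b for a, b in zip(preds_a, preds_b))
    return matches / len(preds_a)

def churn(preds_a, preds_b):
    """Churn is the complement of agreement: how many predictions flip."""
    return 1.0 - agreement(preds_a, preds_b)

# Illustrative intent predictions from two retrainings (different seeds):
run1 = ["BOOK_FLIGHT", "PLAY_MUSIC", "SET_ALARM", "PLAY_MUSIC"]
run2 = ["BOOK_FLIGHT", "PLAY_MUSIC", "GET_WEATHER", "PLAY_MUSIC"]

print(agreement(run1, run2))  # 0.75
print(churn(run1, run2))      # 0.25
```

Both runs were trained identically, yet one prediction flips; churn-reduction techniques such as ensembling, distillation, and co-distillation aim to drive this disagreement toward zero.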
