Knowledge Distillation Improves Stability in Retranslation-based Simultaneous Translation

2021-12-17 · ACL ARR December 2022

Anonymous


Abstract

In simultaneous translation, the retranslation approach has the advantage of requiring no modifications to the inference engine. However, to reduce undesirable instability (flicker) in the output, previous work has resorted to increasing latency through masking and to introducing specialised inference procedures, thereby losing the simplicity of the approach. In this paper, we argue that flicker is caused both by non-monotonicity in the training data and by non-determinism of the resulting model. Both of these can be addressed using knowledge distillation. We evaluate our approach on simultaneously interpreted test sets for English-German and English-Czech and demonstrate that the distilled models have an improved flicker-latency tradeoff, with quality similar to the original.
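The flicker the abstract refers to is commonly quantified as erasure: the number of tokens from the previous partial output that must be deleted when the system retranslates. A minimal sketch of such a metric (the function names and token-level formulation are illustrative, not taken from the paper):

```python
def longest_common_prefix(a, b):
    """Length of the shared prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def erasure(prev_tokens, cur_tokens):
    """Tokens of the previous output that are deleted when the
    system emits the new output (one retranslation step's flicker)."""
    return len(prev_tokens) - longest_common_prefix(prev_tokens, cur_tokens)

def total_flicker(outputs):
    """Sum of erasure over a sequence of successive partial outputs."""
    return sum(erasure(outputs[i - 1], outputs[i])
               for i in range(1, len(outputs)))

# Example: the second retranslation rewrites "cat" as "dog sat",
# deleting one token of the previously displayed output.
outputs = [["the"], ["the", "cat"], ["the", "dog", "sat"]]
print(total_flicker(outputs))
```

Under this view, a more monotone and more deterministic model (as distillation encourages) changes its committed prefix less often, which directly lowers the metric.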
