TenTrans Multilingual Low-Resource Translation System for WMT21 Indo-European Languages Task

2021-11-01 · WMT (EMNLP) 2021

Han Yang, Bojie Hu, Wanying Xie, Ambyera Han, Pan Liu, Jinan Xu, Qi Ju

Abstract

This paper describes TenTrans’ submission to the WMT21 Multilingual Low-Resource Translation shared task for the Romance language pairs. The task focuses on improving translation quality from Catalan to Occitan, Romanian, and Italian with the assistance of related high-resource languages. We mainly use back-translation, pivot-based methods, multilingual models, pre-trained model fine-tuning, and in-domain knowledge transfer to improve translation quality. On the test set, our best submitted system achieves an average case-sensitive BLEU score of 43.45 across all low-resource pairs. The data, code, and pre-trained models used in this work are available in the TenTrans evaluation examples.
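
As a rough illustration of the back-translation idea named in the abstract, and not the authors' implementation, the sketch below pairs monolingual target-side sentences with synthetic source sentences produced by a target-to-source model. The names `toy_reverse_translate`, `TOY_TABLE`, and `back_translate`, along with the placeholder tokens, are hypothetical stand-ins; a real system would use a trained translation model in place of the lookup table.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical lookup table standing in for a trained target-to-source model.
# The tokens are placeholders, not real words in any language.
TOY_TABLE: Dict[str, str] = {"t1": "s1", "t2": "s2"}


def toy_reverse_translate(sentence: str) -> str:
    """Translate word by word; unknown words are copied unchanged."""
    return " ".join(TOY_TABLE.get(word, word) for word in sentence.split())


def back_translate(
    mono_target: List[str],
    reverse_model: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Build (synthetic_source, real_target) pairs from monolingual target text."""
    return [(reverse_model(target), target) for target in mono_target]


if __name__ == "__main__":
    # The synthetic pairs would typically be mixed with genuine parallel data
    # before training the source-to-target model.
    for source, target in back_translate(["t1 t2", "t3"], toy_reverse_translate):
        print(f"{source}\t{target}")
```

The target side of each pair is genuine text, so only the source side carries noise from the reverse model, which is the usual motivation for generating synthetic data in this direction.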
