How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task

2021-08-01 · ACL (WAT) 2021

Rahul Aralikatte, Héctor Ricardo Murrieta Bello, Miryam de Lhoneux, Daniel Hershcovich, Marcel Bollmann, Anders Søgaard

Abstract

This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours and get within 4-5 BLEU points of the top submission on the leaderboard. We also benchmark standard baselines on the PMI corpus and re-discover well-known shortcomings of translation systems and metrics.

Tasks

GPU, Translation
