Marian: Cost-effective High-Quality Neural Machine Translation in C++
2018-05-30 · WS 2018
Marcin Junczys-Dowmunt, Kenneth Heafield, Hieu Hoang, Roman Grundkiewicz, Anthony Aue
Abstract
This paper describes the submissions of the "Marian" team to the WNMT 2018 shared task. We investigate combinations of teacher-student training, low-precision matrix products, auto-tuning and other methods to optimize the Transformer model on GPU and CPU. By further integrating these methods with the new averaging attention networks, a recently introduced faster Transformer variant, we create a number of high-quality, high-performance models on the GPU and CPU, dominating the Pareto frontier for this shared task.
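The abstract mentions low-precision matrix products as one of the optimization methods. As a rough illustration of the underlying idea only, the sketch below quantizes float matrices to 16-bit integers, accumulates in 32-bit, and rescales the result. All names and the per-matrix scaling scheme are illustrative assumptions; Marian's actual CPU kernels use highly optimized SIMD code rather than this naive loop.

```cpp
// Minimal sketch of a low-precision matrix product (not Marian's implementation):
// quantize floats to int16, multiply with int32 accumulation, then dequantize.
#include <cstdint>
#include <cmath>
#include <vector>
#include <iostream>
#include <algorithm>

// Quantize a float matrix to int16 using a single per-matrix scale (assumed scheme).
static std::vector<int16_t> quantize(const std::vector<float>& m, float& scale) {
  float maxAbs = 0.f;
  for (float v : m) maxAbs = std::max(maxAbs, std::fabs(v));
  scale = maxAbs > 0.f ? 32767.f / maxAbs : 1.f;
  std::vector<int16_t> q(m.size());
  for (size_t i = 0; i < m.size(); ++i)
    q[i] = static_cast<int16_t>(std::lround(m[i] * scale));
  return q;
}

// C = A (rows x k) * B (k x cols), accumulated in int32 and rescaled to float.
static std::vector<float> gemmInt16(const std::vector<float>& A,
                                    const std::vector<float>& B,
                                    int rows, int k, int cols) {
  float scaleA, scaleB;
  auto qA = quantize(A, scaleA);
  auto qB = quantize(B, scaleB);
  std::vector<float> C(rows * cols, 0.f);
  for (int r = 0; r < rows; ++r)
    for (int c = 0; c < cols; ++c) {
      int32_t acc = 0;
      for (int i = 0; i < k; ++i)
        acc += static_cast<int32_t>(qA[r * k + i]) * qB[i * cols + c];
      C[r * cols + c] = acc / (scaleA * scaleB);
    }
  return C;
}

int main() {
  std::vector<float> A = {0.5f, -1.0f, 2.0f, 0.25f};  // 2 x 2
  std::vector<float> B = {1.0f, 0.0f, -0.5f, 2.0f};   // 2 x 2
  auto C = gemmInt16(A, B, 2, 2, 2);
  for (float v : C) std::cout << v << " ";
  std::cout << "\n";  // expect approximately: 1 -2 1.875 0.5
}
```

The trade-off this illustrates is the one the paper exploits: smaller integer operands allow faster vectorized multiplication at a small, controllable cost in numerical precision.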