A3-108 Machine Translation System for LoResMT Shared Task @MT Summit 2021 Conference

2021-08-01 · MTSummit 2021

Saumitra Yadav, Manish Shrivastava

Abstract

In this paper, we describe our submissions for the LoResMT Shared Task @MT Summit 2021 Conference. We built statistical translation systems in each direction for the English ⇐⇒ Marathi language pair. This paper outlines initial baseline experiments in which models are trained with various tokenization schemes. Using the optimal tokenization scheme, we create synthetic data and train on the augmented dataset to create more statistical models. We also reorder English to match Marathi syntax and train another set of baseline and data-augmented models using various tokenization schemes. We report the configuration of the submitted systems and the results they produced.
