Automatic Translation of Scientific Documents in the HAL Archive

2012-05-01 · LREC 2012

Patrik Lambert, Holger Schwenk, Frédéric Blain

Abstract

This paper describes the development of a statistical machine translation system between French and English for scientific papers. This system will be closely integrated into the French HAL open archive, a collection of more than 100,000 scientific papers. We describe the creation of in-domain parallel and monolingual corpora, the development of a domain-specific translation system with the created resources, and its adaptation using monolingual resources only. These techniques allowed us to improve a generic system by more than 10 BLEU points.
