The Development of the Multilingual LUNA Corpus for Spoken Language System Porting

2014-05-01 · LREC 2014

Evgeny Stepanov, Giuseppe Riccardi, Ali Orkan Bayer

Abstract

The development of annotated corpora is a critical step in building speech applications for multiple target languages. While the technology to develop a monolingual speech application has reached satisfactory results (in terms of performance and effort), porting an existing application from a source language to a target language is still a very expensive task. In this paper we address the problem of creating multilingual aligned corpora and evaluating them in the context of a spoken language understanding (SLU) porting task. We discuss the challenges of manually creating multilingual corpora and present algorithms for the creation of multilingual SLU via Statistical Machine Translation (SMT).
