Building a Task-oriented Dialog System for Languages with no Training Data: the Case for Basque

2020-05-01 · LREC 2020

Maddalen López de Lacalle, Xabier Saralegi, Iñaki San Vicente


Abstract

This paper presents an approach for developing a task-oriented dialog system for less-resourced languages in scenarios where no training data is available. Both intent classification and slot filling are tackled. We project existing annotations from resource-rich languages by means of Neural Machine Translation (NMT) followed by word alignment. We then compare training on the projected monolingual data with direct model transfer alternatives. Intent classifiers and slot-filling sequence taggers are implemented either with a BiLSTM architecture or by fine-tuning BERT transformer models. Models learnt exclusively from projected Basque data provide better accuracy for slot filling. For intent classification, combining projected Basque training data with data from resource-rich languages consistently outperforms models trained solely on projected data. In both tasks we achieve competitive performance, with accuracies of 81% for intent classification and 77% for slot filling.
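The annotation projection step described in the abstract can be illustrated with a minimal sketch. It assumes BIO-style slot labels and a list of (source index, target index) word-alignment pairs; the function name `project_slots`, the label `datetime`, and the toy alignment are hypothetical and are not taken from the paper. The sentence-level intent label needs no alignment and would presumably be copied to the translated sentence unchanged.

```python
from typing import List, Optional, Tuple


def project_slots(src_labels: List[str],
                  tgt_len: int,
                  alignment: List[Tuple[int, int]]) -> List[str]:
    """Project BIO slot labels from source tokens onto target tokens.

    `alignment` holds (source_index, target_index) pairs, assumed to come
    from a word aligner run over the source sentence and its translation.
    """
    # Slot type (without the B-/I- prefix) per target token; None means outside.
    tgt_types: List[Optional[str]] = [None] * tgt_len
    for s, t in sorted(alignment):
        label = src_labels[s]
        # Keep the first slot type that reaches a target token.
        if label != "O" and tgt_types[t] is None:
            tgt_types[t] = label.split("-", 1)[1]

    # Rebuild BIO tags on the target side: B- opens each run, I- continues it.
    projected: List[str] = []
    prev: Optional[str] = None
    for slot in tgt_types:
        if slot is None:
            projected.append("O")
        elif slot == prev:
            projected.append("I-" + slot)
        else:
            projected.append("B-" + slot)
        prev = slot
    return projected


# Toy example: a 5-token source sentence whose translation has 4 tokens
# and a different word order (the slot moves to the front).
src = ["O", "O", "O", "B-datetime", "I-datetime"]
links = [(0, 3), (1, 2), (3, 0), (4, 1)]
print(project_slots(src, 4, links))
```

Unaligned target tokens fall back to `O`, and two adjacent target spans of the same slot type are merged into one; both are simplifications of this sketch rather than choices documented in the paper.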

Tasks

Classification, General Classification, Intent Classification, Intent Classification and Slot Filling, Machine Translation, NMT, Slot Filling, Translation
