Aligning Multilingual Embeddings for Improved Code-switched Natural Language Understanding

2022-10-01 · COLING 2022 · Code Available

Barah Fazili, Preethi Jyothi

Code

github.com/barahfazili/alignmentforcs
Official · In paper · PyTorch · ★ 1

Abstract

Multilingual pretrained models, while effective on monolingual data, need additional training to work well with code-switched text. In this work, we present a novel idea of training multilingual models with alignment objectives using parallel text so as to explicitly align word representations with the same underlying semantics across languages. Such an explicit alignment step has a positive downstream effect and improves performance on multiple code-switched NLP tasks. We explore two alignment strategies and report improvements of up to 7.32%, 0.76% and 1.9% on Hindi-English Sentiment Analysis, Named Entity Recognition and Question Answering tasks compared to a competitive baseline model.
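
The abstract does not say which two alignment strategies are used, so the following is only an illustrative sketch of what an explicit word-level alignment objective over parallel text could look like. It assumes each sentence is already encoded as a list of word vectors and that word alignments are given as index pairs. The function name `alignment_loss`, the toy vectors, and the choice of a mean squared distance are all assumptions for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def alignment_loss(
    src_embeddings: List[Vector],
    tgt_embeddings: List[Vector],
    aligned_pairs: List[Tuple[int, int]],
) -> float:
    """Mean squared L2 distance between embeddings of aligned word pairs.

    Hypothetical helper: `aligned_pairs` holds (source_index, target_index)
    tuples, for example from a word aligner run over a parallel sentence.
    """
    if not aligned_pairs:
        return 0.0
    total = 0.0
    for i, j in aligned_pairs:
        # Squared distance between the two words' vectors.
        total += sum(
            (a - b) ** 2 for a, b in zip(src_embeddings[i], tgt_embeddings[j])
        )
    return total / len(aligned_pairs)


# Toy 2-dimensional word vectors for an English sentence and its Hindi
# translation (made-up numbers, purely illustrative).
english = [[1.0, 0.0], [0.0, 1.0]]
hindi = [[1.0, 0.0], [0.0, 3.0]]
pairs = [(0, 0), (1, 1)]

loss = alignment_loss(english, hindi, pairs)
print(loss)
```

In a training setup, a term like this would typically be minimized alongside the model's usual objective, so that words with the same meaning in the two languages are pulled toward nearby representations.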

Tasks

Named Entity Recognition (NER) · Natural Language Understanding · Question Answering · Sentiment Analysis
