Investigating Post-pretraining Representation Alignment for Cross-Lingual Question Answering

2021-09-24 · EMNLP (MRQA) 2021 · Code Available

Fahim Faisal, Antonios Anastasopoulos

Code

github.com/ffaisal93/aligned_qa
Official · In paper

Abstract

Human knowledge is collectively encoded in the roughly 6500 languages spoken around the world, but it is not distributed equally across languages. Hence, for information-seeking question answering (QA) systems to adequately serve speakers of all languages, they need to operate cross-lingually. In this work we investigate the capabilities of multilingually pre-trained language models on cross-lingual QA. We find that explicitly aligning the representations across languages with a post-hoc fine-tuning step generally leads to improved performance. We additionally investigate the effect of data size as well as the language choice in this fine-tuning step, also releasing a dataset for evaluating cross-lingual QA systems. Code and dataset are publicly available here: https://github.com/ffaisal93/aligned_qa

Tasks

Cross-Lingual Question Answering, Question Answering
