Is Cross-lingual Evaluation Only About Cross-lingual?

2021-11-16 · ACL ARR November 2021

Anonymous

Abstract

Multilingual pre-trained language models (mPLMs) have achieved great success on various cross-lingual tasks. However, we find that higher performance on these tasks cannot be equated with better cross-lingual ability, because a model's task-specific abilities also influence its performance. In this work, we conduct a comprehensive study of two representative cross-lingual evaluation protocols: sentence retrieval and zero-shot transfer. We find that current cross-lingual evaluation results depend strongly on mPLMs' task-specific abilities, so performance can improve without any improvement in a model's cross-lingual ability. To enable more accurate comparisons of cross-lingual ability between mPLMs, we propose two new indexes based on the two evaluation protocols: calibrated sentence retrieval performance and transfer rate. We show experimentally that the proposed indexes effectively eliminate the effects of task-specific abilities on cross-lingual evaluation.
