Cross-Lingual Training of Neural Models for Document Ranking

2020-11-01 · Findings of the Association for Computational Linguistics

Peng Shi, He Bai, Jimmy Lin

Abstract

We tackle the challenge of cross-lingual training of neural document ranking models for mono-lingual retrieval, specifically leveraging relevance judgments in English to improve search in non-English languages. Our work successfully applies multi-lingual BERT (mBERT) to document ranking and additionally compares against a number of alternatives: translating the training data, translating documents, multi-stage hybrids, and ensembles. Experiments on test collections in six different languages from diverse language families reveal many interesting findings: model-based relevance transfer using mBERT can significantly improve search quality in (non-English) mono-lingual retrieval, but other "low resource" approaches are competitive as well.
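As a concrete illustration of the model-based relevance transfer the abstract describes, the sketch below scores query-document pairs with an mBERT cross-encoder: the model is fine-tuned on English relevance judgments and then applied unchanged to non-English text. This is a minimal sketch assuming a pointwise reranking formulation and the HuggingFace transformers API; the model name, function, and example inputs are illustrative, not the authors' released code or exact training setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# mBERT checkpoint shared across languages; after fine-tuning on English
# relevance judgments, the same weights score documents in other languages.
MODEL = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)
model.eval()

def relevance_score(query: str, document: str) -> float:
    """Pointwise relevance: probability that `document` is relevant to `query`."""
    # Encode the pair as [CLS] query [SEP] document [SEP], truncated to fit.
    inputs = tokenizer(query, document, truncation=True,
                       max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Class 1 = relevant; softmax over the two labels gives a probability.
    return torch.softmax(logits, dim=-1)[0, 1].item()

# Zero-shot cross-lingual use: an English-trained scorer applied to a
# non-English document (hypothetical example inputs).
print(relevance_score("what causes monsoons",
                      "Los monzones son causados por diferencias de temperatura..."))
```

In a full retrieval pipeline, each candidate document from a first-stage retriever would be scored this way and the list re-sorted by descending score; the untuned checkpoint above produces uninformative scores until fine-tuned on relevance data.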
