RepBERT: Contextualized Text Embeddings for First-Stage Retrieval

2020-06-28

Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Min Zhang, Shaoping Ma

Code

github.com/jingtaozhan/RepBERT-Index (official, in paper; PyTorch; ★ 66)
github.com/jingtaozhan/repconc (PyTorch; ★ 119)
github.com/jingtaozhan/JPQ (PyTorch; ★ 52)

Abstract

Although exact term matching between queries and documents is the dominant method for first-stage retrieval, we propose a different approach, called RepBERT, which represents documents and queries with fixed-length contextualized embeddings. The inner products of query and document embeddings are regarded as relevance scores. On the MS MARCO Passage Ranking task, RepBERT achieves state-of-the-art results among all initial retrieval techniques, and its efficiency is comparable to that of bag-of-words methods.
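The scoring rule described in the abstract can be illustrated with a minimal sketch. It assumes the fixed-length embeddings have already been computed; the toy three-dimensional vectors, the document IDs, and the helper names `inner_product` and `retrieve` are hypothetical and are not taken from the RepBERT code. Only the ranking step is shown: each document is scored by its inner product with the query embedding, and the highest-scoring documents are returned.

```python
from typing import Dict, List, Tuple


def inner_product(u: List[float], v: List[float]) -> float:
    """Relevance score: inner product of two equal-length embeddings."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    return sum(a * b for a, b in zip(u, v))


def retrieve(
    query_emb: List[float],
    doc_embs: Dict[str, List[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k documents with the highest inner-product score."""
    scored = [
        (doc_id, inner_product(query_emb, emb))
        for doc_id, emb in doc_embs.items()
    ]
    # Sort by score, highest first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy, hand-written embeddings (illustrative only; a real system would
# obtain them from an encoder and use far more dimensions).
doc_embs = {
    "d1": [0.9, 0.1, 0.0],
    "d2": [0.2, 0.8, 0.1],
    "d3": [0.4, 0.4, 0.4],
}
query_emb = [1.0, 0.0, 0.5]

top = retrieve(query_emb, doc_embs, k=2)
for doc_id, score in top:
    print(doc_id, round(score, 3))
```

Because document embeddings do not depend on the query, they can be computed once offline, leaving only the query embedding and the inner products to compute at query time. This sketch scans every document; a large collection would more likely rely on a dedicated maximum inner product search library.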

Tasks

Passage Ranking, Retrieval
