Towards Robust Passage Re-Ranking Model by Mitigating Lexical Match Bias
Anonymous
Abstract
While deep learning models can overcome the limitations of traditional machine learning algorithms that rely on hand-crafted features, recent studies have shown that these models often achieve high dataset-specific accuracy by exploiting several biases without understanding the deeper semantics of the intended task. In this paper, we show that neural re-ranking models in the information retrieval field are easily deceived by passage–query pairs that exhibit some lexical match but are semantically irrelevant or only weakly relevant. We then create a challenging evaluation dataset to expose the models' inability to fully capture contextual semantic information when learning sequence representations for relevance matching. Finally, to encourage models to focus on the semantic match between passage and query, we explore an adversarial removal method that mitigates the models' tendency to learn the shortcut of lexical match bias present in the training corpus. Experiments on two benchmarks show that, compared with baseline models, our debiased models achieve gains on both the original and the challenging test sets.
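The abstract names an adversarial removal method but does not spell out its mechanics. A common realization of such debiasing (not necessarily the paper's exact formulation) is a gradient reversal layer: an auxiliary head is trained to predict the bias signal (here, a lexical-overlap feature) from the shared representation, while the reversed gradient pushes the encoder to discard that signal. The toy setup below is a minimal NumPy sketch under that assumption; all names (`grl_backward`, the linear encoder/heads) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def grl_backward(grad_from_adv_head, lam=1.0):
    """Gradient Reversal Layer: identity on the forward pass; on the
    backward pass, the gradient flowing from the adversarial (bias)
    head into the encoder is scaled by -lam."""
    return -lam * grad_from_adv_head

# Toy linear "encoder" h = W @ x with two linear heads sharing h:
#   relevance head u (main task: query-passage relevance)
#   bias head      v (predicts a lexical-overlap feature to be removed)
d_in, d_h = 4, 3
W = rng.normal(size=(d_h, d_in))
u = rng.normal(size=d_h)
v = rng.normal(size=d_h)

x = rng.normal(size=d_in)       # toy input features
y_rel, y_bias = 1.0, 1.0        # toy targets
lam, lr = 0.5, 0.1

h = W @ x
err_rel = (u @ h) - y_rel       # squared-error residuals
err_bias = (v @ h) - y_bias

# Gradients of each head's loss w.r.t. the shared representation h
g_main = 2 * err_rel * u
g_adv = 2 * err_bias * v

# Encoder update: main-task gradient plus the REVERSED adversarial
# gradient, so h fits relevance while shedding bias information.
g_h = g_main + grl_backward(g_adv, lam)
W -= lr * np.outer(g_h, x)
```

In practice the bias head is trained normally on its own parameters (`v` here), and only the gradient entering the encoder is reversed; `lam` trades off task accuracy against bias removal.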