Distilling Knowledge from Reader to Retriever for Question Answering
Gautier Izacard, Edouard Grave
Code
- github.com/facebookresearch/FiD (official, PyTorch, ★ 592)
- github.com/lucidrains/marge-pytorch (PyTorch, ★ 76)
- github.com/hackerchenzhuo/LaKo (PyTorch, ★ 25)
- github.com/FenQQQ/Fusion-in-decoder (PyTorch, ★ 1)
Abstract
The task of information retrieval is an important component of many natural language processing systems, such as open domain question answering. While traditional methods were based on hand-crafted features, continuous representations based on neural networks have recently obtained competitive results. A challenge of using such methods is obtaining supervised data to train the retriever model, corresponding to pairs of query and support documents. In this paper, we propose a technique to learn retriever models for downstream tasks, inspired by knowledge distillation, which does not require annotated pairs of queries and documents. Our approach leverages attention scores of a reader model, used to solve the task based on retrieved documents, to obtain synthetic labels for the retriever. We evaluate our method on question answering, obtaining state-of-the-art results.
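The objective described in the abstract can be sketched as follows. This is not the authors' code: it assumes the reader's cross-attention scores have already been aggregated into a single relevance score per retrieved passage (the aggregation scheme and the hypothetical score values below are illustrative). The reader-derived scores are turned into a soft target distribution, and the retriever is trained to match it by minimizing a KL divergence:

```python
import numpy as np

def softmax(x, temp=1.0):
    """Numerically stable softmax over a vector of scores."""
    z = np.asarray(x, dtype=float) / temp
    z -= z.max()  # shift for stability; does not change the result
    e = np.exp(z)
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    p, q = np.asarray(p), np.asarray(q)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

# Hypothetical aggregated reader cross-attention scores, one per passage.
reader_attention = [4.2, 1.1, 0.3, 2.5]
# Hypothetical retriever scores (e.g. query-passage dot products) for the
# same passages.
retriever_scores = [3.0, 2.0, 0.5, 1.0]

# Synthetic labels from the reader, and the retriever's own distribution.
target = softmax(reader_attention)
pred = softmax(retriever_scores)

# Distillation loss: the retriever is trained to minimize this quantity,
# pulling its ranking toward the passages the reader attended to.
loss = kl_divergence(target, pred)
```

In training, the gradient of `loss` would flow only into the retriever's scores; the reader's attention is treated as a fixed teacher signal for that step.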
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| NarrativeQA | FiD+Distil | ROUGE-L | 32 | — | Unverified |
| TriviaQA | FiD+Distil | EM | 72.1 | — | Unverified |