Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks
Anonymous
Abstract
Retrieval-augmented generation models have shown state-of-the-art performance across many knowledge-intensive NLP tasks such as open-domain question answering and fact verification. These models are trained to generate a final output given retrieved passages that can be irrelevant to the input query, which leads them to learn spurious cues or rely on memorization. This work introduces a method to incorporate the evidentiality of passages---whether a passage contains correct evidence to support the output---into training the generator. We introduce a multi-task learning framework that jointly generates the final output and predicts the evidentiality of each passage. We also propose a new task-agnostic method for obtaining high-quality silver evidentiality labels, addressing the issue that gold evidentiality labels are unavailable in most domains. Our experiments on five datasets across three knowledge-intensive tasks---open-domain question answering, fact verification, and knowledge-enhanced dialogue---show that our evidentiality-guided generator significantly outperforms its direct counterpart on all of them and advances the state of the art on three. Our analysis shows that multi-task learning and silver evidentiality mining play key roles.
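The multi-task objective described above can be sketched as a weighted sum of a generation loss and a per-passage evidentiality classification loss. This is a minimal, framework-free illustration, not the paper's implementation: the function name, the interpolation weight `lam`, and the use of mean negative log-likelihood and binary cross-entropy are all assumptions for the sketch.

```python
import math

def evidentiality_guided_loss(gen_token_probs, evid_probs, evid_labels, lam=1.0):
    """Hypothetical multi-task loss combining generation and evidentiality.

    gen_token_probs: model probabilities assigned to each gold output token.
    evid_probs: predicted probability that each retrieved passage is evidential.
    evid_labels: silver evidentiality labels (1 = supports the output, 0 = not).
    lam: assumed weight balancing the two objectives.
    """
    # Generation loss: mean negative log-likelihood of the gold output tokens.
    gen_loss = -sum(math.log(p) for p in gen_token_probs) / len(gen_token_probs)

    # Evidentiality loss: binary cross-entropy over the retrieved passages,
    # asking whether each one contains evidence supporting the output.
    evid_loss = -sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for p, y in zip(evid_probs, evid_labels)
    ) / len(evid_probs)

    return gen_loss + lam * evid_loss
```

Training on this joint loss pressures the generator's encoder to distinguish evidential from irrelevant passages rather than latching onto spurious cues, which is the intuition behind the multi-task framework.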