SOTAVerified

The Effect of Scaling, Retrieval Augmentation and Form on the Factual Consistency of Language Models

2023-11-02Code Available0· sign in to hype

Lovisa Hagström, Denitsa Saynova, Tobias Norlund, Moa Johansson, Richard Johansson

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

Large Language Models (LLMs) make natural interfaces to factual knowledge, but their usefulness is limited by their tendency to deliver inconsistent answers to semantically equivalent questions. For example, a model might predict both "Anne Redpath passed away in Edinburgh." and "Anne Redpath's life ended in London." In this work, we identify potential causes of inconsistency and evaluate the effectiveness of two mitigation strategies: up-scaling and augmenting the LM with a retrieval corpus. Our results on the LLaMA and Atlas models show that both strategies reduce inconsistency while retrieval augmentation is considerably more efficient. We further consider and disentangle the consistency contributions of different components of Atlas. For all LMs evaluated we find that syntactical form and other evaluation task artifacts impact consistency. Taken together, our results provide a better understanding of the factors affecting the factual consistency of language models.

Tasks

Reproductions