Open-Domain Dialog Evaluation using Follow-Ups Likelihood

2022-09-12 · COLING 2022 · Code Available

Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, Walter Daelemans


Abstract

Automatic evaluation of open-domain dialogs remains an unsolved problem: existing methods do not correlate strongly with human annotations. This paper presents a new automated evaluation method using follow-ups: we measure the probability that a language model will continue the conversation with a fixed set of follow-ups (e.g., "not really relevant here", "what are you trying to say"). When compared against twelve existing methods, our new evaluation achieves the highest correlation with human evaluations.
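A minimal sketch of the scoring idea described in the abstract: compute the likelihood a language model assigns to a fixed negative follow-up (e.g. "not really relevant here") given the dialog so far; a higher likelihood suggests a worse response. The toy bigram model below is a hypothetical stand-in for the pretrained language model the paper would use, and the tokens `ontopic`/`offtopic` are illustrative placeholders, not the paper's actual data.

```python
import math

def toy_next_word_probs(context):
    """Hypothetical stand-in for a real language model (e.g. GPT-2):
    returns a next-word distribution given the context. Hand-crafted so
    that after an off-topic reply, the follow-up "not really relevant"
    becomes more likely."""
    last = context[-1] if context else ""
    table = {
        "offtopic": {"not": 0.4, "ok": 0.6},
        "ontopic": {"not": 0.05, "ok": 0.95},
        "not": {"really": 0.8, "ok": 0.2},
        "really": {"relevant": 0.7, "ok": 0.3},
    }
    return table.get(last, {"ok": 1.0})

def followup_log_likelihood(context, followup, lm=toy_next_word_probs):
    """Sum of log P(word | context so far) over the follow-up words.
    This is the quantity the evaluation method thresholds or ranks on."""
    total = 0.0
    ctx = list(context)
    for word in followup:
        probs = lm(ctx)
        total += math.log(probs.get(word, 1e-9))  # floor unseen words
        ctx.append(word)
    return total

# The negative follow-up scores higher after a bad (off-topic) response:
followup = ["not", "really", "relevant"]
bad_score = followup_log_likelihood(["offtopic"], followup)
good_score = followup_log_likelihood(["ontopic"], followup)
```

With a real pretrained LM in place of `toy_next_word_probs`, the same log-likelihood computation (summing token log-probabilities of the fixed follow-up conditioned on the dialog) yields the automatic score that is then correlated with human judgments.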
