Rethinking and Refining the Distinct Metric

2021-11-16ACL ARR November 2021Unverified0· sign in to hype

Anonymous

Unverified — Be the first to reproduce this paper.

Abstract

Distinct is a widely used automatic metric for evaluating the diversity of language generation tasks. However, we observe that the original approach to calculating distinct scores has evident biases that tend to add higher penalties to longer sequences. In this paper, we refine the calculation of distinct scores by re-scaling the number of distinct tokens based on its expectation. We provide both empirical and theoretical evidence to show that our method effectively removes the biases exhibited in the original distinct score. Further analyses also demonstrate that the refined score correlates better with human evaluations.

Tasks

Diversity Text Generation

Rethinking and Refining the Distinct Metric

Abstract

Tasks

Reproductions