Human vs Automatic Metrics: on the Importance of Correlation Design
2018-05-29
Anastasia Shimorina
Code: gitlab.com/webnlg/webnlg-human-evaluation
Abstract
This paper discusses two existing approaches to correlation analysis between automatic evaluation metrics and human scores in the area of natural language generation. Our experiments show that, depending on whether a system-level or a sentence-level correlation analysis is used, the correlation results between automatic scores and human judgments are inconsistent.