NUBIA: NeUral Based Interchangeability Assessor for Text Generation

2020-04-30 · ACL (EvalNLGEval, INLG) 2020

Hassan Kane, Muhammed Yusuf Kocyigit, Ali Abdalla, Pelkins Ajanoh, Mohamed Coulibali

Code

github.com/wl-research/nubia
PyTorch · ★ 53

Abstract

We present NUBIA, a methodology for building automatic evaluation metrics for text generation using only machine learning models as core components. A typical NUBIA model is composed of three modules: a neural feature extractor, an aggregator, and a calibrator. We demonstrate an implementation of NUBIA that outperforms the metrics currently used to evaluate machine translation and summaries, and that matches or slightly exceeds state-of-the-art metrics in correlation with human judgement on the WMT segment-level Direct Assessment task, sentence-level ranking, and image captioning evaluation. The implemented model is modular, explainable, and set to continuously improve over time.
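The three-module structure named in the abstract (feature extractor, aggregator, calibrator) can be pictured as a simple pipeline. The following is only an illustrative sketch, not the authors' implementation and not the API of the linked repository: every function name, the token-overlap features, the fixed weights, and the clipping step are assumptions made here so the example runs without any trained model. In an actual NUBIA model the feature extractor is neural and the later stages may be learned.

```python
from typing import List, Sequence


def extract_features(reference: str, candidate: str) -> List[float]:
    """Hypothetical stand-in for the neural feature extractor.

    Uses simple token statistics instead of learned models so that the
    sketch has no external dependencies.
    """
    ref = set(reference.lower().split())
    cand = set(candidate.lower().split())
    union = ref | cand
    # Jaccard overlap of the two token sets (1.0 if both are empty).
    overlap = len(ref & cand) / len(union) if union else 1.0
    # Ratio of the smaller vocabulary size to the larger one.
    if ref and cand:
        length_ratio = min(len(ref), len(cand)) / max(len(ref), len(cand))
    else:
        length_ratio = 0.0
    return [overlap, length_ratio]


def aggregate(features: Sequence[float], weights: Sequence[float]) -> float:
    """Hypothetical aggregator: a fixed weighted sum of the features."""
    return sum(w * f for w, f in zip(weights, features))


def calibrate(raw_score: float) -> float:
    """Hypothetical calibrator: clip the raw score into [0, 1]."""
    return max(0.0, min(1.0, raw_score))


def score(reference: str, candidate: str,
          weights: Sequence[float] = (0.7, 0.3)) -> float:
    """Run the three stages in order: extract, aggregate, calibrate."""
    features = extract_features(reference, candidate)
    return calibrate(aggregate(features, weights))


if __name__ == "__main__":
    print(score("the cat sat on the mat", "the cat sat on the mat"))
    print(score("the cat sat on the mat", "stocks fell sharply today"))
```

Keeping the stages as separate functions mirrors the modularity the abstract claims: any one stage could be swapped for a stronger model without touching the other two.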

Tasks

BIG-bench Machine Learning, Image Captioning, Machine Translation, Sentence, Text Generation, Translation
