VizSeq: A Visual Analysis Toolkit for Text Generation Tasks

2019-09-12 · IJCNLP 2019 · Code Available

Changhan Wang, Anirudh Jain, Danlu Chen, Jiatao Gu

Abstract

Automatic evaluation of text generation tasks (e.g., machine translation, text summarization, image captioning, and video description) usually relies heavily on task-specific metrics such as BLEU and ROUGE. These metrics, however, are abstract numbers and are not perfectly aligned with human assessment. This suggests inspecting detailed examples as a complement, in order to identify system error patterns. In this paper, we present VizSeq, a visual analysis toolkit for instance-level and corpus-level system evaluation on a wide variety of text generation tasks. It supports multimodal sources and multiple text references, and provides visualization in a Jupyter notebook or a web app interface. It can be used locally or deployed onto public servers for centralized data hosting and benchmarking. It covers the most common n-gram based metrics, accelerated with multiprocessing, and also provides the latest embedding-based metrics such as BERTScore.
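
The abstract mentions instance-level n-gram metrics with multiple references, accelerated with multiprocessing. The following is a minimal sketch of that general idea, not VizSeq's actual API or implementation: it computes a clipped bigram precision (the building block of BLEU) per sentence and optionally spreads the work over a process pool. The names `ngrams`, `clipped_precision`, and `score_corpus` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from multiprocessing import Pool


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(pair, n=2):
    """Clipped n-gram precision of one hypothesis against several references.

    `pair` is (hypothesis, references): a string and a list of strings.
    Tokenization is plain whitespace splitting (an assumption of this sketch).
    """
    hypothesis, references = pair
    hyp_counts = ngrams(hypothesis.split(), n)
    total = sum(hyp_counts.values())
    if total == 0:
        # Hypothesis is too short to contain any n-gram.
        return 0.0
    # For each n-gram, keep the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref.split(), n).items():
            max_ref[gram] = max(max_ref[gram], count)
    # Clip each hypothesis count by the reference count before summing.
    matched = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    return matched / total


def score_corpus(hypotheses, references, workers=1):
    """Score every sentence; use a process pool when workers > 1."""
    pairs = list(zip(hypotheses, references))
    if workers <= 1:
        return [clipped_precision(p) for p in pairs]
    with Pool(workers) as pool:
        # Sentences are independent, so they can be scored in parallel.
        return pool.map(clipped_precision, pairs)
```

Per-sentence scores like these are what make instance-level inspection possible: sentences can be sorted by score to surface the worst cases. When calling `score_corpus` with `workers > 1` from a script, place the call under an `if __name__ == "__main__":` guard so that it also works on platforms that spawn new processes.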
