Quda: Natural Language Queries for Visual Data Analytics

2020-05-07

Siwei Fu, Kai Xiong, Xiaodong Ge, Siliang Tang, Wei Chen, Yingcai Wu

Abstract

The identification of analytic tasks from free text is critical for visualization-oriented natural language interfaces (V-NLIs) to suggest effective visualizations. However, it is challenging due to the ambiguous and complex nature of human language. To address this challenge, we present a new dataset, called Quda, that aims to help V-NLIs recognize analytic tasks from free-form natural language by enabling the training and evaluation of cutting-edge multi-label classification models. Our dataset contains 14,035 diverse user queries, each annotated with one or more analytic tasks. We built the dataset by first gathering seed queries from data analysts and then employing a large crowd workforce for paraphrase generation and validation. We demonstrate the usefulness of Quda through three applications. This work is the first attempt to construct a large-scale corpus for recognizing analytic tasks. We hope that the release of Quda will boost the research and development of V-NLIs in data analysis and visualization.

Tasks

Multi-Label Classification, Natural Language Queries, Paraphrase Generation
