Beyond Adjacency Pairs: Hierarchical Clustering of Long Sequences for Human-Machine Dialogues

2020-11-01 · EMNLP (CODI) 2020

Maitreyee Maitreyee

Abstract

This work proposes a framework to predict sequences in dialogues using turn-based syntactic features and dialogue control functions. Syntactic features were extracted using dependency parsing, while dialogue control functions were manually labelled. These features were transformed using tf-idf and word embeddings; feature selection was done using Principal Component Analysis (PCA). We ran experiments on six combinations of features to predict sequences with Hierarchical Agglomerative Clustering. An analysis of the clustering results indicates that using word embeddings and syntactic features significantly improved the results.
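The tf-idf and agglomerative clustering steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the paper's implementation. The toy turns, the feature tokens (dependency labels plus a control-function tag), the cosine distance, and the average linkage are all assumptions made for illustration, since the abstract does not specify them. The word embedding and PCA steps are left out to keep the example short.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one tf-idf vector (list of floats) per tokenised document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed idf, so a term present in every document keeps a non-zero weight.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[t] / len(doc) * idf[t] for t in vocab])
    return vectors


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per vector."""
    clusters = [[i] for i in range(len(vectors))]
    dist = [[cosine_distance(u, v) for v in vectors] for u in vectors]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical toy turns: each turn is already reduced to feature tokens
# (dependency labels plus an invented control-function tag).
turns = [
    ["nsubj", "root", "dobj", "question"],
    ["nsubj", "root", "dobj", "question"],
    ["root", "intj", "feedback"],
    ["intj", "root", "feedback"],
]
labels = agglomerative(tfidf_vectors(turns), n_clusters=2)
```

In this toy input the first two turns share all their tokens, as do the last two, so the two pairs end up in separate clusters. A real pipeline would feed in parsed turns and would likely rely on a library implementation of these steps.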

Tasks

Clustering · Dependency Parsing · Feature Selection · Word Embeddings
