
Hierarchical Latent Word Clustering

2016-01-20

Halid Ziya Yerebakan, Fitsum Reda, Yiqiang Zhan, Yoshihisa Shinagawa

Abstract

This paper presents a new Bayesian non-parametric model that extends the usage of Hierarchical Dirichlet Allocation to extract tree-structured word clusters from text data. The model's inference algorithm collects words into a cluster if they share a similar distribution over documents. In our experiments, we observed meaningful hierarchical structures on the NIPS corpus and on radiology reports collected from public repositories.
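
To make the clustering criterion concrete, the toy sketch below computes each word's distribution over documents and compares two such distributions. It is not the paper's Bayesian non-parametric inference: cosine similarity is swapped in as a simple stand-in for "similar distribution over documents", and the function names and the four-document corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter
import math


def word_document_distributions(docs):
    """Map each word to its normalized distribution over documents.

    Hypothetical helper: entry d of a word's vector is the fraction of
    that word's total occurrences that fall in document d.
    """
    counts = {}
    for d, doc in enumerate(docs):
        for word, c in Counter(doc.split()).items():
            counts.setdefault(word, [0] * len(docs))[d] += c
    return {w: [c / sum(v) for c in v] for w, v in counts.items()}


def cosine(p, q):
    """Cosine similarity between two vectors (a stand-in similarity measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm


# Toy corpus (invented for illustration, not taken from the paper's data).
docs = [
    "neuron spike neuron cortex",
    "spike cortex neuron",
    "kernel margin kernel svm",
    "svm margin kernel",
]
dists = word_document_distributions(docs)

# Words that occur in the same documents score higher than words that do not.
same_topic = cosine(dists["neuron"], dists["spike"])
cross_topic = cosine(dists["neuron"], dists["kernel"])
```

In this toy corpus, "neuron" and "spike" occur in the same two documents, so their similarity is higher than that of "neuron" and "kernel", which never co-occur. A model of the kind described in the abstract would place such words in a tree of clusters rather than compare them pairwise.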
