Exploring Topic Discriminating Power of Words in Latent Dirichlet Allocation

2016-12-01 · COLING 2016

Kai Yang, Yi Cai, Zhenhong Chen, Ho-fung Leung, Raymond Lau


Abstract

Latent Dirichlet Allocation (LDA) and its variants have been widely used to discover latent topics in textual documents. However, some of the topics generated by LDA may be noisy, with irrelevant words scattered across them. We call such words topic-indiscriminate words; they tend to make topics more ambiguous and less interpretable by humans. In this work, we propose a new topic model named TWLDA, which assigns low weights to words with low topic discriminating power (ability). Our experimental results show that the proposed approach effectively reduces the number of topic-indiscriminate words in discovered topics and improves the effectiveness of LDA.
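
The abstract does not state how TWLDA measures topic discriminating power, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes a fitted topic-word matrix is available and scores each word by one minus the normalized entropy of its distribution over topics, so that a word spread evenly across topics gets a weight near 0 and a word concentrated in one topic gets a weight near 1. The function name `topic_discriminating_weights` and the uniform topic prior are hypothetical choices made for illustration.

```python
import numpy as np


def topic_discriminating_weights(topic_word):
    """Score each word by how unevenly it is spread across topics.

    topic_word: array of shape (num_topics, vocab_size); row k holds p(word | topic k).
    Returns an array of shape (vocab_size,) with values in [0, 1].

    Assumption (not from the paper): the weight is 1 minus the normalized
    entropy of p(topic | word), computed under a uniform prior over topics.
    """
    topic_word = np.asarray(topic_word, dtype=float)
    num_topics = topic_word.shape[0]
    if num_topics < 2:
        raise ValueError("need at least two topics to compare")

    # p(topic | word) under a uniform topic prior: normalize each column.
    column_sums = topic_word.sum(axis=0, keepdims=True)
    if np.any(column_sums <= 0):
        raise ValueError("every word needs nonzero probability in some topic")
    p_topic_given_word = topic_word / column_sums

    # Entropy per word; zero entries contribute 0 (log(1) = 0 stands in for them).
    safe = np.where(p_topic_given_word > 0, p_topic_given_word, 1.0)
    entropy = -(p_topic_given_word * np.log(safe)).sum(axis=0)

    # Normalize by the maximum possible entropy, log(num_topics).
    return 1.0 - entropy / np.log(num_topics)


# Toy example: two topics over the hypothetical vocabulary ["the", "gene", "stock"].
phi = np.array([
    [0.5, 0.5, 0.0],  # topic 0
    [0.5, 0.0, 0.5],  # topic 1
])
weights = topic_discriminating_weights(phi)
```

In this toy example "the" is equally likely under both topics and would be down-weighted, while "gene" and "stock" each appear in a single topic and keep a high weight. How such weights would enter inference is left open here, since the abstract does not describe it.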

Tasks

Topic Models
