Modeling Text using the Continuous Space Topic Model with Pre-Trained Word Embeddings
Seiichi Inoue, Taichi Aida, Mamoru Komachi, Manabu Asai
Abstract
In this study, we propose a model that extends the continuous space topic model (CSTM), which flexibly controls word probability in a document, using pre-trained word embeddings. To develop the proposed model, we pre-train word embeddings, which capture the semantics of words, and plug them into the CSTM. Intrinsic experimental results show that the proposed model outperforms the CSTM in terms of perplexity and convergence speed. Furthermore, extrinsic experimental results show that the proposed model is useful for a document classification task when compared with the baseline model. We also show qualitatively that the latent coordinates obtained by training the proposed model are better than those of the baseline model.