
Unsupervised Fine-tuning for Text Clustering

2020-12-01 · COLING 2020

Shaohan Huang, Furu Wei, Lei Cui, Xingxing Zhang, Ming Zhou


Abstract

Fine-tuning with pre-trained language models (e.g. BERT) has achieved great success in many language understanding tasks in supervised settings (e.g. text classification). However, relatively little work has focused on applying pre-trained models in unsupervised settings, such as text clustering. In this paper, we propose a novel method to fine-tune pre-trained models in an unsupervised manner for text clustering, which simultaneously learns text representations and cluster assignments using a clustering-oriented loss. Experiments on three text clustering datasets (namely TREC-6, Yelp, and DBpedia) show that our model outperforms the baseline methods and achieves state-of-the-art results.
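The abstract does not spell out the clustering-oriented loss. A common objective of this kind (used in DEC-style deep clustering, which jointly refines representations and cluster assignments) computes soft assignments of embeddings to cluster centroids, sharpens them into a target distribution, and minimizes the KL divergence between the two. The sketch below illustrates that family of losses in NumPy; the function names and the Student's t kernel are assumptions for illustration, not the paper's verified implementation.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    # Soft assignment q_ij of embedding i to cluster j via a
    # Student's t-distribution kernel (as in DEC-style clustering).
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    # Sharpen assignments: square q and renormalize by cluster
    # frequency, emphasizing high-confidence points.
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def clustering_loss(q, p):
    # KL(P || Q), averaged over samples; gradients of this loss
    # would be backpropagated into the pre-trained encoder.
    return float((p * np.log(p / q)).sum() / q.shape[0])

# Toy usage with random "text embeddings" standing in for encoder output.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 4))          # 8 texts, 4-dim embeddings
centroids = rng.normal(size=(3, 4))  # 3 clusters
q = soft_assign(z, centroids)
p = target_distribution(q)
loss = clustering_loss(q, p)
```

In a full pipeline, `z` would come from the pre-trained language model, and minimizing `clustering_loss` would update both the encoder and the centroids, so that representations and cluster assignments are learned simultaneously as the abstract describes.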
