Label-Wise Document Pre-Training for Multi-Label Text Classification

2020-08-15

Han Liu, Caixia Yuan, Xiaojie Wang

Code

github.com/laddie132/LW-PT
Official · In paper · PyTorch · ★ 2

Abstract

A major challenge of multi-label text classification (MLTC) is to simultaneously exploit possible label differences and label correlations. In this paper, we tackle this challenge by developing the Label-Wise Pre-Training (LW-PT) method, which produces a document representation with label-aware information. The basic idea is that a multi-label document can be represented as a combination of multiple label-wise representations, and that correlated labels always co-occur in the same or similar documents. LW-PT implements this idea by constructing label-wise document classification tasks and training label-wise document encoders. Finally, the pre-trained label-wise encoder is fine-tuned on the downstream MLTC task. Extensive experimental results validate that the proposed method has significant advantages over previous state-of-the-art models and is able to discover reasonable label relationships. The code is released to support other researchers.
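The idea of representing a document as a combination of label-wise representations can be illustrated with a small sketch. The abstract does not say how each label-wise representation is computed, so the attention pooling below, the function names, and the toy vectors are all assumptions made for illustration; this is not the authors' implementation, which lives in the linked repository.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_wise_representations(token_vectors, label_queries):
    """Build one document vector per label.

    Assumption: each label owns a query vector and pools the tokens by
    dot-product attention. The abstract does not confirm this choice.
    """
    dim = len(token_vectors[0])
    reps = []
    for query in label_queries:
        # Score every token against this label's query.
        scores = [sum(q * t for q, t in zip(query, tok)) for tok in token_vectors]
        weights = softmax(scores)
        # Attention-weighted sum of the token vectors.
        rep = [
            sum(w * tok[d] for w, tok in zip(weights, token_vectors))
            for d in range(dim)
        ]
        reps.append(rep)
    return reps


def combine(reps):
    """Combine label-wise vectors by concatenation (one possible choice)."""
    return [x for rep in reps for x in rep]


# Toy input: three 2-dimensional token vectors and two hypothetical labels.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[1.0, 0.0], [0.0, 1.0]]

reps = label_wise_representations(tokens, queries)
doc_vector = combine(reps)
```

Each label attends most strongly to the tokens aligned with its query, so the two label-wise vectors differ even though they summarize the same document. A downstream classifier would consume `doc_vector`, or score each label from its own vector.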

Tasks

Classification, Document Classification, General Classification, Multi Label Text Classification, Multi-Label Text Classification, text-classification, Text Classification

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
AAPD	LW-PT	Micro F1	72.8	—	Unverified
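For readers unfamiliar with the metric, Micro F1 pools true positives, false positives, and false negatives over all documents and labels before computing precision and recall. The sketch below is a minimal stdlib-only version; the function name and the toy label sets are illustrative assumptions, not data from the paper. It returns a fraction in [0, 1], whereas the table value appears to be on a 0 to 100 scale.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label predictions.

    y_true and y_pred are sequences of label sets, one per document.
    """
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        true_set, pred_set = set(true_labels), set(pred_labels)
        tp += len(true_set & pred_set)   # labels predicted and correct
        fp += len(pred_set - true_set)   # labels predicted but wrong
        fn += len(true_set - pred_set)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up label sets.
gold = [{"cs.CL", "cs.LG"}, {"stat.ML"}, {"cs.CV"}]
pred = [{"cs.CL"}, {"stat.ML", "cs.LG"}, {"cs.CV"}]

score = micro_f1(gold, pred)  # tp=3, fp=1, fn=1
```

Because the counts are pooled, frequent labels influence the score more than rare ones, which is the main difference from macro-averaged F1.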
