Text Categorization by Learning Predominant Sense of Words as Auxiliary Task

2019-07-01 · ACL 2019 · Code Available

Kazuya Shimura, Jiyi Li, Fumiyo Fukumoto

Abstract

The sense distributions of words are often highly skewed and strongly influenced by the domain of a document. This paper builds on that observation and presents a method for text categorization that leverages the predominant sense of each word in a given domain, i.e., its domain-specific sense. The key idea is that features learned from predominant senses can discriminate the domain of a document and thus improve overall text categorization performance. We propose a multi-task learning framework based on the Transformer neural network model, which trains a model to simultaneously categorize documents and predict the predominant sense of each word. Experimental results on four benchmark datasets show that our method is comparable to the state-of-the-art categorization approach and works especially well for multi-label documents.
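
The multi-task setup described in the abstract can be illustrated with a small sketch of a joint training objective. The abstract does not state how the two tasks are combined, so everything here is an assumption made for illustration: the weighted sum, the `aux_weight` parameter, the use of binary cross-entropy for multi-label categories, and all function names are hypothetical and are not taken from the paper or its code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sense_loss(token_logits, sense_ids):
    """Auxiliary task: mean cross-entropy over tokens.

    token_logits[i] holds one score per candidate sense of token i;
    sense_ids[i] is the index of that token's predominant sense.
    """
    losses = [
        -math.log(softmax(logits)[gold])
        for logits, gold in zip(token_logits, sense_ids)
    ]
    return sum(losses) / len(losses)


def category_loss(doc_logits, labels):
    """Main task: mean binary cross-entropy over categories (multi-label).

    Uses the stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, y in zip(doc_logits, labels):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(labels)


def joint_loss(doc_logits, labels, token_logits, sense_ids, aux_weight=0.5):
    """Assumed combination: main loss plus a weighted auxiliary loss."""
    main = category_loss(doc_logits, labels)
    aux = sense_loss(token_logits, sense_ids)
    return main + aux_weight * aux
```

In a real model both sets of logits would come from a shared encoder, so gradients from the sense-prediction term would also shape the representation used for categorization. Setting `aux_weight=0.0` in this sketch reduces it to plain multi-label categorization.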