Embedding Convolutions for Short Text Extreme Classification with Millions of Labels

2021-12-17 · ACL ARR December 2022

Anonymous

Abstract

In this paper, we propose InceptionXML, a convolutional architecture that is lightweight yet powerful and robust to the inherent lack of word order in short-text queries in search and recommendation tasks. We demonstrate the efficacy of applying convolutions by recasting the operation along the embedding dimension rather than along the word dimension used in conventional CNNs for text classification. To scale our model to problems with millions of labels, we also propose the InceptionXML+ framework. It addresses the shortcomings of the dynamic hard-negative mining framework in the recently proposed LightXML by improving the alignment between the label shortlister and the extreme classifier. InceptionXML+ is not only smaller than the state-of-the-art deep extreme classifier Astec in terms of model size, but also significantly outperforms it on popular benchmark datasets. For reproducibility, the code is made available as part of this submission.
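
To make the "embedding dimension instead of word dimension" idea concrete, the following is a minimal sketch in plain Python. It is not the InceptionXML implementation: the helper names (`conv1d`, `transpose`), the sizes, and the random weights are all assumptions chosen for illustration. It only contrasts which axis the filter slides along and which axis is treated as channels.

```python
import random

def conv1d(x, kernel):
    """Valid 1D cross-correlation of a multi-channel signal with one filter.

    x:      shape (channels, n), given as a list of lists
    kernel: shape (channels, k), given as a list of lists
    Returns a list of length n - k + 1.
    """
    channels, n, k = len(x), len(x[0]), len(kernel[0])
    assert len(kernel) == channels
    return [
        sum(x[c][i + j] * kernel[c][j] for c in range(channels) for j in range(k))
        for i in range(n - k + 1)
    ]

def transpose(m):
    """Swap the two axes of a list-of-lists matrix."""
    return [list(row) for row in zip(*m)]

random.seed(0)
seq_len, embed_dim, k = 4, 8, 3  # hypothetical sizes for a short query

# One short query: seq_len word embeddings, each of size embed_dim.
query = [[random.gauss(0, 1) for _ in range(embed_dim)] for _ in range(seq_len)]

# Conventional text CNN: the filter slides over word positions,
# and the embedding dimensions act as channels.
word_kernel = [[random.gauss(0, 1) for _ in range(k)] for _ in range(embed_dim)]
word_out = conv1d(transpose(query), word_kernel)

# Recast convolution: the filter slides over the embedding dimension,
# and the word positions act as channels.
embed_kernel = [[random.gauss(0, 1) for _ in range(k)] for _ in range(seq_len)]
embed_out = conv1d(query, embed_kernel)

print(len(word_out), len(embed_out))
```

In this sketch, every output of the recast convolution sums over all word positions at once, so no output depends on a local window of neighbouring words. Whether and how this yields the robustness to word order described in the abstract depends on the full architecture, which this sketch does not reproduce.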
