Improving Sentiment Classification Using 0-Shot Generated Labels for Custom Transformer Embeddings

2021-09-29

Ryan Bluteau, Robin Gras

Abstract

We present an approach to improving sentiment classification with transformers (based on BERT and DistilBERT) by using additional embeddings to represent emotion inputs. We used HuggingFace's 0-shot prediction pipeline to generate the probability that each emotion applies to a given sample. We generated these 0-shot probabilities, using 63 emotions, for 1.6 million samples from a sentiment classification dataset and for a smaller airline sentiment dataset. We then added custom tokens to BERT's tokenizer and embeddings, each representing one level of one predicted emotion. Finally, depending on the probability of each emotion, the custom token representing the corresponding level was prepended to the text input, and the model was trained on this input for classification. We additionally tested adding the emotion inputs directly at the classification layer, as well as an ensemble of BERT and DistilBERT models that both use emotion inputs, with a modest increase in sentiment prediction accuracy. Our results show a modest improvement over the original model in all cases, for both BERT and DistilBERT, when emotion inputs generated by 0-shot pretrained models are added.
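The token-prepending step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the number of levels (`NUM_LEVELS = 3`), the equal-width bucketing, the token format `[emotion_level]`, and all function names are assumptions made for illustration, since the abstract does not specify them. The emotion probabilities are passed in as a plain dictionary so the example runs without downloading a model.

```python
from typing import Dict, List

# Assumption: the abstract does not state how many levels were used.
NUM_LEVELS = 3


def level_token(emotion: str, prob: float, num_levels: int = NUM_LEVELS) -> str:
    """Map a probability in [0, 1] to a hypothetical token such as '[joy_2]'."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError(f"probability out of range: {prob}")
    # Equal-width buckets; prob == 1.0 is clamped into the top bucket.
    level = min(int(prob * num_levels), num_levels - 1)
    return f"[{emotion}_{level}]"


def all_level_tokens(emotions: List[str], num_levels: int = NUM_LEVELS) -> List[str]:
    """List every custom token that would need to be added to the tokenizer."""
    return [f"[{e}_{lvl}]" for e in emotions for lvl in range(num_levels)]


def prepend_emotion_tokens(
    text: str, scores: Dict[str, float], num_levels: int = NUM_LEVELS
) -> str:
    """Prepend one level token per emotion to the input text."""
    tokens = [level_token(e, p, num_levels) for e, p in scores.items()]
    return " ".join(tokens + [text])


if __name__ == "__main__":
    # Made-up scores standing in for 0-shot pipeline output.
    scores = {"joy": 0.95, "anger": 0.1}
    print(prepend_emotion_tokens("Great flight!", scores))
```

In a full setup, the scores would presumably come from HuggingFace's `pipeline("zero-shot-classification")`, whose result contains parallel `labels` and `scores` lists, and the tokens from `all_level_tokens` would be registered with `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))` so that each custom token gets its own trainable embedding.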

Tasks

Classification, Sentiment Analysis, Sentiment Classification
