What confuses BERT? Linguistic Evaluation of Sentiment Analysis on Telecom Customer Opinion

2021-10-01ROCLING 2021Unverified0· sign in to hype

Cing-Fang Shih, Yu-Hsiang Tseng, Ching-Wen Yang, Pin-Er Chen, Hsin-Yu Chou, Lian-Hui Tan, Tzu-Ju Lin, Chun-Wei Wang, Shu-Kai Hsieh

arXiv PDF

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

Ever-expanding evaluative texts on online forums have become an important source of sentiment analysis. This paper proposes an aspect-based annotated dataset consisting of telecom reviews on social media. We introduce a category, implicit evaluative texts, impevals for short, to investigate how the deep learning model works on these implicit reviews. We first compare two models, BertSimple and BertImpvl, and find that while both models are competent to learn simple evaluative texts, they are confused when classifying impevals. To investigate the factors underlying the correctness of the model’s predictions, we conduct a series of analyses, including qualitative error analysis and quantitative analysis of linguistic features with logistic regressions. The results show that local features that affect the overall sentential sentiment confuse the model: multiple target entities, transitional words, sarcasm, and rhetorical questions. Crucially, these linguistic features are independent of the model’s confidence measured by the classifier’s softmax probabilities. Interestingly, the sentence complexity indicated by syntax-tree depth is not correlated with the model’s correctness. In sum, this paper sheds light on the characteristics of the modern deep learning model and when it might need more supervision through linguistic evaluations.

Tasks

Sentence Sentiment Analysis

What confuses BERT? Linguistic Evaluation of Sentiment Analysis on Telecom Customer Opinion

Abstract

Tasks

Reproductions