Testing the Ability of Language Models to Interpret Figurative Language
Emmy Liu, Chen Cui, Kenneth Zheng, Graham Neubig
Code
- github.com/nightingal3/fig-qa (official, PyTorch)
- github.com/simran-khanuja/multilingual-fig-qa (PyTorch)
Abstract
Figurative and metaphorical language are commonplace in discourse, and figurative expressions play an important role in communication and cognition. However, figurative language has been a relatively under-studied area in NLP, and it remains an open question to what extent modern language models can interpret nonliteral phrases. To address this question, we introduce Fig-QA, a Winograd-style nonliteral language understanding task consisting of correctly interpreting paired figurative phrases with divergent meanings. We evaluate the performance of several state-of-the-art language models on this task, and find that although language models achieve performance significantly over chance, they still fall short of human performance, particularly in zero- or few-shot settings. This suggests that further work is needed to improve the nonliteral reasoning capabilities of language models.
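The evaluation setup described above can be sketched as follows: each example pairs a figurative phrase with two candidate interpretations of divergent meaning, the model scores both, and the higher-scoring candidate is taken as its answer. This is a minimal illustrative sketch, not the official Fig-QA harness; the dataset fields are assumed for illustration, and the word-overlap scorer is a stand-in for the language-model likelihood scoring used in the paper.

```python
# Minimal sketch of a paired-interpretation evaluation in the style of Fig-QA.
# Each example holds a figurative phrase, two candidate paraphrases with
# divergent meanings, and the index of the correct one. The scorer here is a
# word-overlap placeholder; a real evaluation would score each candidate with
# a language model's log-probability given the phrase.
from dataclasses import dataclass

@dataclass
class Example:
    phrase: str             # figurative sentence
    candidates: tuple       # two paraphrases with divergent meanings
    label: int              # index (0 or 1) of the correct interpretation

def overlap_score(phrase: str, candidate: str) -> int:
    # Placeholder scorer: number of words shared between phrase and candidate.
    return len(set(phrase.lower().split()) & set(candidate.lower().split()))

def accuracy(examples) -> float:
    # Pick whichever candidate scores higher, then compare to the gold label.
    correct = sum(
        1 for ex in examples
        if max(range(2), key=lambda i: overlap_score(ex.phrase, ex.candidates[i]))
        == ex.label
    )
    return correct / len(examples)

# Illustrative example (not from the actual dataset).
examples = [
    Example(
        "Her memory is a steel trap",
        ("Her memory is excellent", "She forgets things quickly"),
        0,
    ),
]
print(accuracy(examples))  # → 1.0
```

Swapping the placeholder for an actual LM scorer (e.g., comparing per-candidate log-likelihoods) turns this harness into the zero-shot setting the abstract describes; few-shot settings would additionally prepend solved examples to the prompt.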