Attention Is (not) All You Need for Commonsense Reasoning

2019-05-31 · ACL 2019 · Code Available

Tassilo Klein, Moin Nabi

Code

github.com/SAP-samples/acl2020-commonsense (PyTorch, ★ 29)
github.com/SAP-samples/acl2019-commonsense-reasoning (PyTorch, ★ 14)

Abstract

The recently introduced BERT model exhibits strong performance on several language understanding benchmarks. In this paper, we describe a simple re-implementation of BERT for commonsense reasoning. We show that the attentions produced by BERT can be directly utilized for tasks such as the Pronoun Disambiguation Problem and Winograd Schema Challenge. Our proposed attention-guided commonsense reasoning method is conceptually simple yet empirically powerful. Experimental analysis on multiple datasets demonstrates that our proposed system performs remarkably well on all cases while outperforming the previously reported state of the art by a margin. While results suggest that BERT seems to implicitly learn to establish complex relationships between entities, solving commonsense reasoning tasks might require more than unsupervised models learned from huge text corpora.
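
The abstract does not spell out how the attentions are turned into a decision; the benchmark table below only names the method "MAS". The following is a minimal sketch of one plausible reading, assuming MAS is a maximum-attention score: for every layer and head, only the candidate that receives the most attention from the pronoun keeps its weight, and the kept weights are summed and normalized across candidates. The function name, the input layout, and the toy numbers are hypothetical and are not taken from the authors' code.

```python
from typing import Dict, List

# Hypothetical layout: attention[layer][head] is the attention weight the
# pronoun assigns to a candidate (e.g. the maximum over the candidate's tokens).
Attention = List[List[float]]


def max_attention_scores(attn: Dict[str, Attention]) -> Dict[str, float]:
    """Score candidates by the attention mass they win across layers and heads.

    Assumption: this mirrors a maximum-attention scoring rule; it is a sketch,
    not the reference implementation.
    """
    names = list(attn)
    if not names:
        raise ValueError("at least one candidate is required")

    totals = {name: 0.0 for name in names}
    n_layers = len(attn[names[0]])
    for layer in range(n_layers):
        n_heads = len(attn[names[0]][layer])
        for head in range(n_heads):
            # Only the candidate with the largest weight at this (layer, head)
            # keeps its weight; all other candidates contribute zero here.
            winner = max(names, key=lambda n: attn[n][layer][head])
            totals[winner] += attn[winner][layer][head]

    norm = sum(totals.values())
    if norm == 0.0:
        # No attention on any candidate: fall back to a uniform score.
        return {name: 1.0 / len(names) for name in names}
    return {name: total / norm for name, total in totals.items()}


# Toy weights for two candidates over 2 layers x 2 heads (made-up numbers).
toy_attention = {
    "trophy": [[0.6, 0.1], [0.3, 0.2]],
    "suitcase": [[0.2, 0.4], [0.1, 0.1]],
}
scores = max_attention_scores(toy_attention)
prediction = max(scores, key=scores.get)
```

In practice the per-candidate weights would come from a BERT forward pass that returns attention maps; that extraction step is omitted here because the abstract gives no details about it.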

Tasks

Coreference Resolution, Natural Language Understanding

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
Winograd Schema Challenge	BERT-base 110M + MAS	Accuracy	60.3	—	Unverified
Winograd Schema Challenge	USSM + Supervised DeepNet + KB	Accuracy	52.8	—	Unverified
Winograd Schema Challenge	USSM + KB	Accuracy	52	—	Unverified
