"Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection

2017-05-01 · ACL 2017

William Yang Wang


Code

Repository	Framework	Stars
github.com/manideep2510/siamese-BERT-fake-news-detection-LIAR	pytorch	152
github.com/SindhuMadi/FakeNewsDetection	none	20
github.com/JelenaBanjac/lies-have-short-legs	none	0
github.com/ExploringLies/lies-have-short-legs	none	0
github.com/mansoor9743/Fake-News-Detection	tf	0
github.com/siddarthhari95/NLP-Fake-News_detection	tf	0
github.com/ekagra-ranjan/fake-news-detection-LIAR-pytorch	pytorch	0
github.com/kiranrawat/Detecting-Fake-News-On-Social-Media	none	0
github.com/JelenaBanjac/AppliedDataAnalysis	none	0
github.com/bedarkarpriyanka/NLP-Project-Fake-News-Detection	tf	0

Abstract

Automatic fake news detection is a challenging problem in deception detection, and it has tremendous real-world political and social impacts. However, statistical approaches to combating fake news have been dramatically limited by the lack of labeled benchmark datasets. In this paper, we present LIAR: a new, publicly available dataset for fake news detection. We collected 12.8K manually labeled short statements, spanning a decade and a variety of contexts, from PolitiFact.com, which provides a detailed analysis report and links to source documents for each case. This dataset can be used for fact-checking research as well. Notably, this new dataset is an order of magnitude larger than the previously largest public fake news datasets of similar type. Empirically, we investigate automatic fake news detection based on surface-level linguistic patterns. We have designed a novel, hybrid convolutional neural network to integrate meta-data with text. We show that this hybrid approach can improve a text-only deep learning model.
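To make the "meta-data with text" idea concrete, the sketch below separates each record into the statement text and its accompanying metadata, which is the split a hybrid model would consume as two inputs. The 14-column tab-separated layout in `COLUMNS` is an assumption about how the dataset files are organized and is not stated in the abstract; `Statement`, `parse_rows`, and the sample row are hypothetical and invented purely for illustration.

```python
import csv
import io
from dataclasses import dataclass

# ASSUMPTION: column order of a LIAR-style TSV row (not given in the abstract).
COLUMNS = [
    "id", "label", "statement", "subject", "speaker", "job", "state", "party",
    "barely_true_count", "false_count", "half_true_count",
    "mostly_true_count", "pants_on_fire_count", "context",
]


@dataclass
class Statement:
    label: str      # truthfulness rating assigned by the fact-checker
    text: str       # the short statement itself (text input)
    metadata: dict  # everything else (metadata input)


def parse_rows(tsv_text):
    """Split each tab-separated row into text and metadata."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    parsed = []
    for row in reader:
        if not row:
            continue
        # Pad short rows so every column name gets a value.
        row = row + [""] * (len(COLUMNS) - len(row))
        record = dict(zip(COLUMNS, row))
        metadata = {key: value for key, value in record.items()
                    if key not in ("id", "label", "statement")}
        parsed.append(Statement(record["label"], record["statement"], metadata))
    return parsed


# Invented sample row, for illustration only (not taken from the dataset).
SAMPLE = ("0001.json\tfalse\tAn invented example statement.\teconomy\t"
          "jane-doe\tsenator\tOhio\tnone\t1\t2\t0\t0\t0\ta press release\n")

statements = parse_rows(SAMPLE)
print(statements[0].label, statements[0].metadata["speaker"])
```

Keeping the metadata in a separate dictionary makes it straightforward to experiment with subsets of fields, in the spirit of the "Text + Speaker" and "Text + All" variants listed in the benchmark results.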

Tasks

Deception Detection, Fact Checking, Fake News Detection

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
LIAR	Hybrid CNNs (Text + All)	Test Accuracy	0.27	—	Unverified
LIAR	CNNs	Test Accuracy	0.27	—	Unverified
LIAR	Hybrid CNNs (Text + Speaker)	Test Accuracy	0.25	—	Unverified
LIAR	Bi-LSTMs	Test Accuracy	0.23	—	Unverified
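As a reading aid for the Test Accuracy column, the sketch below computes accuracy as the fraction of test items whose predicted label equals the gold label, and compares it against a majority-class baseline. This is a minimal illustration under the usual definition of accuracy, not the paper's evaluation code; the function names `accuracy` and `majority_baseline` and the toy labels are hypothetical.

```python
from collections import Counter


def accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def majority_baseline(train_labels, test_labels):
    """Accuracy obtained by always predicting the most frequent training label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return accuracy([most_common_label] * len(test_labels), test_labels)


# Toy labels, invented for illustration only.
train = ["false", "half-true", "half-true", "true"]
test = ["half-true", "false", "true", "half-true"]
print(majority_baseline(train, test))
```

A majority-class baseline of this kind gives a floor against which low absolute accuracies on a multi-class task can be judged.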