Binary text classification

Papers

Showing 1–10 of 20 papers

Title	Date	Tasks	Status	Hype
MAGE: Machine-generated Text Detection in the Wild	May 22, 2023	Binary text classification, Face Swapping	Code Available	2
LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?	Jan 11, 2024	Binary text classification	Code Available	2
TweepFake: about Detecting Deepfake Tweets	Jul 31, 2020	Binary text classification, DeepFake Detection	Code Available	1
Ghostbuster: Detecting Text Ghostwritten by Large Language Models	May 24, 2023	Articles, Binary text classification	Code Available	1
TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation	Sep 27, 2021	Authorship Attribution, Binary text classification	Code Available	1
Active Learning for BERT: An Empirical Study	Nov 1, 2020	Active Learning, Binary text classification	Code Available	1
GigaCheck: Detecting LLM-generated Content	Oct 31, 2024	Binary text classification, Boundary Detection	Unverified	0
Analyzing the Generalizability of Deep Contextualized Language Representations For Text Classification	Mar 22, 2023	Binary text classification, News Classification	Unverified	0
Neural Legal Judgment Prediction in English	Jun 5, 2019	Binary text classification, General Classification	Unverified	0
Calibrated Large Language Models for Binary Question Answering	Jul 1, 2024	Binary text classification, Question Answering	Unverified	0

Datasets: Ghostbuster (All Domains); MAGE (Arbitrary-domains & Arbitrary-models); MixSet (Binary); TURINGBENCH (Turing Test, FAIR_wmt20); TURINGBENCH (Turing Test, GPT-3); TweepFake; ECHR Non-Anonymized

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GigaCheck (Mistral-7B)	F1 score	1	—	Unverified
2	Ghostbuster	F1 score	0.99	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GigaCheck (Mistral-7B)	Average Recall	0.96	—	Unverified
2	Longformer	Average Recall	0.91	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GigaCheck (Mistral-7B)	F1 score	0.99	—	Unverified
2	Radar	F1 score	0.88	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GigaCheck (Mistral-7B)	F1 score	1	—	Unverified
2	RoBERTa	F1 score	0.45	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GigaCheck (Mistral-7B)	F1 score	0.97	—	Unverified
2	RoBERTa	F1 score	0.52	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GigaCheck (Mistral-7B)	F1 score	0.94	—	Unverified
2	XLNet	F1 score	0.88	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HIER-BERT	Macro F1	82	—	Unverified