SOTAVerified|Agents Browse Leaderboard About

Dialogue Evaluation

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 81–90 of 97 papers

Title	Date	Tasks	Status	Hype
Learning the Human Judgment for the Automatic Evaluation of Chatbot	May 1, 2020	ChatbotDialogue Evaluation	—Unverified	0
Learning an Unreferenced Metric for Online Dialogue Evaluation	May 1, 2020	Dialogue Evaluation	CodeCode Available	1
USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation	May 1, 2020	Dialogue EvaluationOpen-Domain Dialog	CodeCode Available	1
PONE: A Novel Automatic Evaluation Metric for Open-Domain Generative Dialogue Systems	Apr 6, 2020	Dialogue Evaluation	CodeCode Available	1
How to Evaluate the Next System: Automatic Dialogue Evaluation from the Perspective of Continual Learning	Dec 10, 2019	Continual LearningDialogue Evaluation	—Unverified	0
Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems	Nov 4, 2019	Dialogue Evaluation	CodeCode Available	0
Towards Best Experiment Design for Evaluating Dialogue System Output	Sep 23, 2019	Dialogue Evaluation	CodeCode Available	0
ACUTE-EVAL: Improved Dialogue Evaluation with Optimized Questions and Multi-turn Comparisons	Sep 6, 2019	Dialogue Evaluation	—Unverified	0
Investigating Evaluation of Open-Domain Dialogue Systems With Human Generated Multiple References	Jul 24, 2019	Dialogue EvaluationDiversity	CodeCode Available	0
Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems	Jun 21, 2019	Dialogue EvaluationKnowledge Distillation	CodeCode Available	0

Show:10 25 50

← PrevPage 9 of 10Next →

All datasets USR-TopicalChat USR-PersonaChat

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	MDD-Eval	Spearman Correlation	0.51	—	Unverified
2	Lin-Reg (all)	Spearman Correlation	0.49	—	Unverified
3	USR	Spearman Correlation	0.42	—	Unverified
4	USR - DR (x = c)	Spearman Correlation	0.32	—	Unverified
5	USR - MLM	Spearman Correlation	0.31	—	Unverified
6	USR - DR (x = f)	Spearman Correlation	0.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Lin-Reg (all)	Spearman Correlation	0.54	—	Unverified
2	USR - DR (x = c)	Spearman Correlation	0.48	—	Unverified
3	USR	Spearman Correlation	0.47	—	Unverified
4	USR - MLM	Spearman Correlation	0.08	—	Unverified
5	USR - DR (x = f)	Spearman Correlation	-0.05	—	Unverified