SOTAVerified

Dialogue Evaluation

Papers

Showing 61–70 of 97 papers

| Title | Status | Hype |
|---|---|---|
| What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation | Code | 0 |
| Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges | | 0 |
| DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations | Code | 0 |
| Achieving Reliable Human Assessment of Open-Domain Dialogue Systems | Code | 0 |
| FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows | | 0 |
| Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents | | 0 |
| MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation | Code | 0 |
| User Response and Sentiment Prediction for Automatic Dialogue Evaluation | | 0 |
| GCDF1: A Goal- and Context-Driven F-Score for Evaluating User Models | Code | 0 |
| Proxy Indicators for the Quality of Open-domain Dialogues | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MDD-Eval | Spearman Correlation | 0.51 | | Unverified |
| 2 | Lin-Reg (all) | Spearman Correlation | 0.49 | | Unverified |
| 3 | USR | Spearman Correlation | 0.42 | | Unverified |
| 4 | USR - DR (x = c) | Spearman Correlation | 0.32 | | Unverified |
| 5 | USR - MLM | Spearman Correlation | 0.31 | | Unverified |
| 6 | USR - DR (x = f) | Spearman Correlation | 0.14 | | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Lin-Reg (all) | Spearman Correlation | 0.54 | | Unverified |
| 2 | USR - DR (x = c) | Spearman Correlation | 0.48 | | Unverified |
| 3 | USR | Spearman Correlation | 0.47 | | Unverified |
| 4 | USR - MLM | Spearman Correlation | 0.08 | | Unverified |
| 5 | USR - DR (x = f) | Spearman Correlation | -0.05 | | Unverified |
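The benchmark tables above rank each metric by Spearman correlation, i.e. how well the metric's ordering of dialogue quality agrees with human judgments. As a minimal stdlib-only sketch (the scores below are hypothetical, not drawn from any listed paper):

```python
# Spearman correlation = Pearson correlation computed on rank-transformed data.
# This is the agreement measure reported in the benchmark tables above.

def ranks(xs):
    """Average 1-based ranks, with ties assigned the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # extend j over any run of tied values
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation between two equal-length score lists."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

metric_scores = [0.2, 0.5, 0.9, 0.1, 0.7]  # hypothetical automatic metric scores
human_scores = [1, 3, 5, 2, 4]             # hypothetical human quality ratings
print(round(spearman(metric_scores, human_scores), 2))  # 0.9
```

A value near 1 (as in the top rows of the tables) means the metric orders dialogues almost exactly as humans do; values near 0 or below, like the -0.05 entry, indicate no agreement or a slight inversion.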