SOTAVerified

Dialogue Evaluation

Papers

Showing 91–97 of 97 papers

| Title | Status | Hype |
| --- | --- | --- |
| Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems | Code | 0 |
| Better Automatic Evaluation of Open-Domain Dialogue Systems with Contextualized Embeddings | | 0 |
| Evaluating Coherence in Dialogue Systems using Entailment | Code | 0 |
| Re-evaluating ADEM: A Deeper Look at Scoring Dialogue Responses | | 0 |
| One "Ruler" for All Languages: Multi-Lingual Dialogue Evaluation with Adversarial Multi-Task Learning | | 0 |
| Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses | Code | 0 |
| Adversarial Learning for Neural Dialogue Generation | Code | 0 |

Benchmark Results

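| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MDD-Eval | Spearman Correlation | 0.51 | | Unverified |
| 2 | Lin-Reg (all) | Spearman Correlation | 0.49 | | Unverified |
| 3 | USR | Spearman Correlation | 0.42 | | Unverified |
| 4 | USR - DR (x = c) | Spearman Correlation | 0.32 | | Unverified |
| 5 | USR - MLM | Spearman Correlation | 0.31 | | Unverified |
| 6 | USR - DR (x = f) | Spearman Correlation | 0.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Lin-Reg (all) | Spearman Correlation | 0.54 | | Unverified |
| 2 | USR - DR (x = c) | Spearman Correlation | 0.48 | | Unverified |
| 3 | USR | Spearman Correlation | 0.47 | | Unverified |
| 4 | USR - MLM | Spearman Correlation | 0.08 | | Unverified |
| 5 | USR - DR (x = f) | Spearman Correlation | -0.05 | | Unverified |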