Dialogue Evaluation
Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MDD-Eval | Spearman Correlation | 0.51 | — | Unverified |
| 2 | Lin-Reg (all) | Spearman Correlation | 0.49 | — | Unverified |
| 3 | USR | Spearman Correlation | 0.42 | — | Unverified |
| 4 | USR - DR (x = c) | Spearman Correlation | 0.32 | — | Unverified |
| 5 | USR - MLM | Spearman Correlation | 0.31 | — | Unverified |
| 6 | USR - DR (x = f) | Spearman Correlation | 0.14 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Lin-Reg (all) | Spearman Correlation | 0.54 | — | Unverified |
| 2 | USR - DR (x = c) | Spearman Correlation | 0.48 | — | Unverified |
| 3 | USR | Spearman Correlation | 0.47 | — | Unverified |
| 4 | USR - MLM | Spearman Correlation | 0.08 | — | Unverified |
| 5 | USR - DR (x = f) | Spearman Correlation | -0.05 | — | Unverified |
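
Every row above reports a Spearman correlation, which for dialogue evaluation is typically computed between a metric's automatic scores and human quality ratings for the same responses. The following is a minimal sketch of that computation using only the standard library. The `metric_scores` and `human_ratings` values are hypothetical toy data, not drawn from any benchmark listed here, and the helper names `average_ranks` and `spearman` are illustrative rather than part of any evaluated model's code.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the full run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: one automatic score and one human rating per response.
metric_scores = [0.91, 0.40, 0.75, 0.22, 0.60]
human_ratings = [5, 2, 3, 1, 4]

rho = spearman(metric_scores, human_ratings)
print(round(rho, 2))  # 0.9
```

Because the coefficient depends only on rank order, a metric can score well here even when its raw values sit on a different scale from the human ratings. A negative value, as in the last row above, means the metric tends to rank responses in the opposite order from the human judges.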