| Transformers for Headline Selection for Russian News Clusters | Jun 19, 2021 | Dialogue Evaluation, Sentence | Code Available | 0 |
| Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation | Jun 10, 2021 | Binary Classification, Dialogue Evaluation | Code Available | 0 |
| A Comprehensive Assessment of Dialog Evaluation Metrics | Jun 7, 2021 | Dialogue Evaluation, Response Generation | Code Available | 1 |
| Improving Automated Evaluation of Open Domain Dialog via Diverse Reference Augmentation | Jun 5, 2021 | Dialogue Evaluation, Open-Domain Dialog | Code Available | 0 |
| Conversations Are Not Flat: Modeling the Dynamic Information Flow across Dialogue Utterances | Jun 4, 2021 | Chatbot, Dialogue Evaluation | Code Available | 1 |
| DynaEval: Unifying Turn and Dialogue Level Evaluation | Jun 2, 2021 | Dialogue Evaluation | Code Available | 1 |
| Generating Negative Samples by Manipulating Golden Responses for Unsupervised Learning of a Response Evaluation Model | Jun 1, 2021 | Dialogue Evaluation | Code Available | 0 |
| Towards Quantifiable Dialogue Coherence Evaluation | Jun 1, 2021 | Coherence Evaluation, Dialogue Evaluation | Code Available | 1 |
| Assessing Dialogue Systems with Distribution Distances | May 6, 2021 | Dialogue Evaluation | Code Available | 1 |
| DCH-2: A Parallel Customer-Helpdesk Dialogue Corpus with Distributions of Annotators' Labels | Apr 18, 2021 | Dialogue Evaluation, Machine Translation | —Unverified | 0 |