| MME-CRS: Multi-Metric Evaluation Based on Correlation Re-Scaling for Evaluating Open-Domain Dialogue | Jun 19, 2022 | Dialogue Evaluation, MME | Unverified | 0 |
| Findings of the The RuATD Shared Task 2022 on Artificial Text Detection in Russian | Jun 3, 2022 | Binary Classification, Dialogue Evaluation | Code Available | 1 |
| InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning | May 25, 2022 | Dialogue Evaluation, Dialogue Generation | Code Available | 1 |
| RuNNE-2022 Shared Task: Recognizing Nested Named Entities | May 23, 2022 | Dialogue Evaluation, named-entity-recognition | Code Available | 1 |
| AdaCoach: A Virtual Coach for Training Customer Service Agents | Apr 27, 2022 | Dialogue Evaluation | Unverified | 0 |
| What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation | Mar 25, 2022 | Dialogue Evaluation, Open-Domain Dialog | Code Available | 0 |
| Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges | Mar 18, 2022 | Dialogue Evaluation | Unverified | 0 |
| DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations | Mar 18, 2022 | Abstract Meaning Representation, Coherence Evaluation | Code Available | 0 |
| Achieving Reliable Human Assessment of Open-Domain Dialogue Systems | Mar 11, 2022 | Dialogue Evaluation | Code Available | 0 |
| FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows | Feb 14, 2022 | Dialogue Evaluation | Unverified | 0 |