| PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment | Dec 18, 2022 | Data Augmentation, Dialogue Evaluation | —Unverified | 0 |
| Pragmatically Appropriate Diversity for Dialogue Evaluation | Apr 6, 2023 | Dialogue Evaluation, Diversity | —Unverified | 0 |
| Predicting Ratings of Real Dialogue Participants from Artificial Data and Ratings of Human Dialogue Observers | May 1, 2020 | Dialogue Evaluation | —Unverified | 0 |
| Dialogue Evaluation with Offline Reinforcement Learning | Sep 2, 2022 | Dialogue Evaluation, Offline RL | —Unverified | 0 |
| RADE: Reference-Assisted Dialogue Evaluation for Open-Domain Dialogue | Sep 15, 2023 | Dialogue Evaluation, Multi-Task Learning | —Unverified | 0 |
| Re-evaluating ADEM: A Deeper Look at Scoring Dialogue Responses | Feb 23, 2019 | Dialogue Evaluation, Response Generation | —Unverified | 0 |
| Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges | Mar 18, 2022 | Dialogue Evaluation | —Unverified | 0 |
| Dialogue You Can Trust: Human and AI Perspectives on Generated Conversations | Sep 3, 2024 | Dialogue Evaluation | —Unverified | 0 |
| DRE: An Effective Dual-Refined Method for Integrating Small and Large Language Models in Open-Domain Dialogue Evaluation | Jun 4, 2025 | Dialogue Evaluation, valid | —Unverified | 0 |
| Enhancing the Open-Domain Dialogue Evaluation in Latent Space | Aug 1, 2021 | Dialogue Evaluation | —Unverified | 0 |