| Generating Negative Samples by Manipulating Golden Responses for Unsupervised Learning of a Response Evaluation Model | Jun 1, 2021 | Dialogue Evaluation | Code Available | 0 |
| Improving Automated Evaluation of Open Domain Dialog via Diverse Reference Augmentation | Jun 5, 2021 | Dialogue Evaluation, Open-Domain Dialog | Code Available | 0 |
| Investigating Evaluation of Open-Domain Dialogue Systems With Human Generated Multiple References | Jul 24, 2019 | Dialogue Evaluation, Diversity | Code Available | 0 |
| MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation | Dec 14, 2021 | Dialogue Evaluation | Code Available | 0 |
| Measuring the Robustness of Reference-Free Dialogue Evaluation Systems | Jan 12, 2025 | Dialogue Evaluation, TAG | Code Available | 0 |
| MEDAL: A Framework for Benchmarking LLMs as Multilingual Open-Domain Chatbots and Dialogue Evaluators | May 28, 2025 | Benchmarking, Chatbot | Code Available | 0 |
| Methods for Recognizing Nested Terms | Apr 22, 2025 | Dialogue Evaluation, named-entity-recognition | Code Available | 0 |
| PairEval: Open-domain Dialogue Evaluation with Pairwise Comparison | Apr 1, 2024 | Dialogue Evaluation | Code Available | 0 |
| Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems | Nov 4, 2019 | Dialogue Evaluation | Code Available | 0 |
| Proxy Indicators for the Quality of Open-domain Dialogues | Nov 1, 2021 | Dialogue Evaluation | Code Available | 0 |
| RuOpinionNE-2024: Extraction of Opinion Tuples from Russian News Texts | Apr 9, 2025 | Dialogue Evaluation, Language Modeling | Code Available | 0 |
| SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation | Aug 17, 2022 | Contrastive Learning, Dialogue Evaluation | Code Available | 0 |
| Simple LLM Prompting is State-of-the-Art for Robust and Multilingual Dialogue Evaluation | Aug 31, 2023 | Dialogue Evaluation | Code Available | 0 |
| SLIDE: A Framework Integrating Small and Large Language Models for Open-Domain Dialogues Evaluation | May 24, 2024 | Contrastive Learning, Dialogue Evaluation | Code Available | 0 |
| Soda-Eval: Open-Domain Dialogue Evaluation in the age of LLMs | Aug 20, 2024 | Dialogue Evaluation | Code Available | 0 |
| Emphasising Structured Information: Integrating Abstract Meaning Representation into LLMs for Enhanced Open-Domain Dialogue Evaluation | Apr 1, 2024 | Abstract Meaning Representation, Dialogue Evaluation | Code Available | 0 |
| Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation | Jun 10, 2021 | Binary Classification, Dialogue Evaluation | Code Available | 0 |
| Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses | Aug 23, 2017 | Dialogue Evaluation | Code Available | 0 |
| Towards Multilingual Automatic Dialogue Evaluation | Aug 31, 2023 | Dialogue Evaluation, Machine Translation | Code Available | 0 |
| Transformers for Headline Selection for Russian News Clusters | Jun 19, 2021 | Dialogue Evaluation, Sentence | Code Available | 0 |
| What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation | Mar 25, 2022 | Dialogue Evaluation, Open-Domain Dialog | Code Available | 0 |
| Towards Best Experiment Design for Evaluating Dialogue System Output | Sep 23, 2019 | Dialogue Evaluation | Code Available | 0 |