| Emphasising Structured Information: Integrating Abstract Meaning Representation into LLMs for Enhanced Open-Domain Dialogue Evaluation | Apr 1, 2024 | Abstract Meaning Representation, Dialogue Evaluation | Code Available | 0 |
| CodingTeachLLM: Empowering LLM's Coding Ability via AST Prior Knowledge | Mar 13, 2024 | Dialogue Evaluation, HumanEval | Unverified | 0 |
| A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators | Dec 24, 2023 | Dialogue Evaluation | Code Available | 0 |
| xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark | Oct 13, 2023 | Dialogue Evaluation, Machine Translation | Code Available | 0 |
| RADE: Reference-Assisted Dialogue Evaluation for Open-Domain Dialogue | Sep 15, 2023 | Dialogue Evaluation, Multi-Task Learning | Unverified | 0 |
| Exploring the Impact of Human Evaluator Group on Chat-Oriented Dialogue Evaluation | Sep 14, 2023 | Chatbot, Dialogue Evaluation | Code Available | 0 |
| Simple LLM Prompting is State-of-the-Art for Robust and Multilingual Dialogue Evaluation | Aug 31, 2023 | Dialogue Evaluation | Code Available | 0 |
| Towards Multilingual Automatic Dialogue Evaluation | Aug 31, 2023 | Dialogue Evaluation, Machine Translation | Code Available | 0 |
| C-PMI: Conditional Pointwise Mutual Information for Turn-level Dialogue Evaluation | Jun 27, 2023 | Dialogue Evaluation | Code Available | 0 |
| How to Choose How to Choose Your Chatbot: A Massively Multi-System MultiReference Data Set for Dialog Metric Evaluation | May 23, 2023 | Chatbot, Dialogue Evaluation | Unverified | 0 |