SOTAVerified

NLG Evaluation

Evaluate text generated by natural language generation (NLG) systems, such as large language models.

Papers

Showing 21-30 of 71 papers

Title | Status | Hype
CoAScore: Chain-of-Aspects Prompting for NLG Evaluation | | 0
Evaluation rules! On the use of grammars and rule-based systems for NLG evaluation | | 0
A Snapshot of NLG Evaluation Practices 2005 - 2014 | | 0
DeepSeek vs. o3-mini: How Well can Reasoning LLMs Evaluate MT and Summarization? | | 0
A Survey of Natural Language Generation | | 0
All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text | | 0
DHP Benchmark: Are LLMs Good NLG Evaluators? | | 0
Dialect-robust Evaluation of Generated Text | | 0
Dolphin: A Challenging and Diverse Benchmark for Arabic NLG | | 0
Ev2R: Evaluating Evidence Retrieval in Automated Fact-Checking | | 0
Page 3 of 8

No leaderboard results yet.