SOTAVerified

NLG Evaluation

Evaluate text generated by natural language generation (NLG) systems, such as large language models.

Papers

Showing 21–30 of 71 papers

Title | Status | Hype
The Pitfalls of Defining Hallucination | | 0
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges | Code | 1
LUNA: A Framework for Language Understanding and Naturalness Assessment | Code | 1
CoAScore: Chain-of-Aspects Prompting for NLG Evaluation | | 0
X-Eval: Generalizable Multi-aspect Text Evaluation via Augmented Instruction Tuning with Auxiliary Evaluation Aspects | | 0
Towards Multiple References Era -- Addressing Data Leakage and Limited Reference Diversity in NLG Evaluation | Code | 0
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models | Code | 0
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering | Code | 0
Rethinking Model Evaluation as Narrowing the Socio-Technical Gap | | 0
Dolphin: A Challenging and Diverse Benchmark for Arabic NLG | | 0

No leaderboard results yet.