SOTAVerified

NLG Evaluation

Evaluate text generated by natural language generation (NLG) systems, such as large language models.

Papers

Showing 31-40 of 71 papers

Title | Status | Hype
Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References | Code | 0
Evaluating Evaluation Metrics: A Framework for Analyzing NLG Evaluation Metrics using Measurement Theory | Code | 1
NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference Checklist | Code | 3
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment | Code | 1
Is ChatGPT a Good NLG Evaluator? A Preliminary Study | Code | 1
Describe me an Aucklet: Generating Grounded Perceptual Category Descriptions | Code | 0
CLSE: Corpus of Linguistically Significant Entities | Code | 0
Dialect-robust Evaluation of Generated Text |  | 0
Towards a Unified Multi-Dimensional Evaluator for Text Generation | Code | 2
Not All Errors are Equal: Learning Text Generation Metrics using Stratified Error Synthesis | Code | 1
Page 4 of 8

No leaderboard results yet.