SOTAVerified

NLG Evaluation

Evaluation of text generated by NLG (Natural Language Generation) systems, such as large language models.

Papers

Showing 11–20 of 71 papers

Title | Status | Hype
Evaluating Evaluation Metrics: A Framework for Analyzing NLG Evaluation Metrics using Measurement Theory | Code | 1
Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling | Code | 0
Long-Form Information Alignment Evaluation Beyond Atomic Facts | Code | 0
A Study of Automatic Metrics for the Evaluation of Natural Language Explanations | Code | 0
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering | Code | 0
Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation | Code | 0
CLSE: Corpus of Linguistically Significant Entities | Code | 0
EffEval: A Comprehensive Evaluation of Efficiency for MT Evaluation Metrics | Code | 0
DEBATE: Devil's Advocate-Based Assessment and Text Evaluation | Code | 0
Are LLM-based Evaluators Confusing NLG Quality Criteria? | Code | 0
Page 2 of 8

No leaderboard results yet.