SOTAVerified

NLG Evaluation

Evaluation of text generated by NLG (Natural Language Generation) systems, such as large language models.
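Many of the metrics studied in the papers below are reference-based: a candidate text is scored against one or more human-written references. As a minimal illustration (a toy sketch, not the method of any specific paper listed here), a unigram-overlap F1 metric can be written in pure Python:

```python
from collections import Counter

def unigram_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between a candidate and a reference text.

    A toy reference-based NLG metric: tokenization is whitespace-based
    and case-insensitive, and matches are clipped per token type
    (a token in the candidate can only match as many times as it
    appears in the reference).
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped token matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, `unigram_f1("the cat", "the cat sat")` gives precision 1.0 and recall 2/3, hence F1 = 0.8. Real metrics in this literature go well beyond this, from n-gram methods like BLEU to learned and LLM-based evaluators.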

Papers

Showing 21–30 of 71 papers

Title | Status | Hype
EffEval: A Comprehensive Evaluation of Efficiency for MT Evaluation Metrics | Code | 0
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering | Code | 0
Are LLM-based Evaluators Confusing NLG Quality Criteria? | Code | 0
Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation and Understanding | Code | 0
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation | Code | 0
Describe me an Aucklet: Generating Grounded Perceptual Category Descriptions | Code | 0
Long-Form Information Alignment Evaluation Beyond Atomic Facts | Code | 0
OpeNLGauge: An Explainable Metric for NLG Evaluation with Open-Weights LLMs | Code | 0
Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling | Code | 0
Towards Multiple References Era -- Addressing Data Leakage and Limited Reference Diversity in NLG Evaluation | Code | 0
Page 3 of 8

No leaderboard results yet.