SOTAVerified

NLG Evaluation

Evaluation of text generated by NLG (Natural Language Generation) systems, such as large language models.

Papers

Showing 11–20 of 71 papers

Title | Status | Hype
Evaluating Evaluation Metrics: A Framework for Analyzing NLG Evaluation Metrics using Measurement Theory | Code | 1
Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling | Code | 0
Long-Form Information Alignment Evaluation Beyond Atomic Facts | Code | 0
A Study of Automatic Metrics for the Evaluation of Natural Language Explanations | Code | 0
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering | Code | 0
Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation | Code | 0
CLSE: Corpus of Linguistically Significant Entities | Code | 0
EffEval: A Comprehensive Evaluation of Efficiency for MT Evaluation Metrics | Code | 0
DEBATE: Devil's Advocate-Based Assessment and Text Evaluation | Code | 0
Are LLM-based Evaluators Confusing NLG Quality Criteria? | Code | 0
Page 2 of 8

No leaderboard results yet.