
nlg evaluation

Evaluating text generated by natural language generation (NLG) systems, such as large language models.

Papers


Showing 26–50 of 71 papers

Title	Date	Tasks	Status
NLG-Metricverse: An End-to-End Library for Evaluating Natural Language Generation	Oct 1, 2022	Management, nlg evaluation	Unverified
Evaluation of Text Generation: A Survey	Jun 26, 2020	nlg evaluation, Survey	Unverified
Evaluation rules! On the use of grammars and rule-based systems for NLG evaluation	Dec 1, 2020	nlg evaluation, Position	Unverified
Exploring the Multilingual NLG Evaluation Abilities of LLM-Based Evaluators	Mar 6, 2025	nlg evaluation	Unverified
The Authenticity Gap in Human Evaluation	May 24, 2022	nlg evaluation, Single Particle Analysis	Unverified
ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation	Jun 10, 2021	nlg evaluation, Text Generation	Unverified
ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation	Dec 17, 2021	nlg evaluation, Text Generation	Unverified
Language Model Augmented Relevance Score	Aug 19, 2021	Language Modeling, Language Modelling	Unverified
Large Language Models Are Active Critics in NLG Evaluation	Oct 14, 2024	nlg evaluation, Prompt Engineering	Unverified
LLM-based NLG Evaluation: Current Status and Challenges	Feb 2, 2024	nlg evaluation, Text Generation	Unverified
A Survey of Natural Language Generation	Dec 22, 2021	Data-to-Text Generation, Deep Learning	Unverified
MIPE: A Metric Independent Pipeline for Effective Code-Mixed NLG Evaluation	Jul 24, 2021	Diversity, nlg evaluation	Unverified
A Tutorial on Evaluation Metrics used in Natural Language Generation	Jun 1, 2021	nlg evaluation, Text Generation	Unverified
Ev2R: Evaluating Evidence Retrieval in Automated Fact-Checking	Nov 8, 2024	Fact Checking, nlg evaluation	Unverified
Agreement is overrated: A plea for correlation to assess human evaluation reliability	Oct 1, 2019	nlg evaluation	Unverified
Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts	Apr 29, 2025	All, Diversity	Unverified
All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text	Jun 30, 2021	All, Articles	Unverified
All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text	Aug 1, 2021	All, Articles	Unverified
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text	Feb 14, 2022	nlg evaluation, Text Generation	Unverified
Rethinking Model Evaluation as Narrowing the Socio-Technical Gap	Jun 1, 2023	Explainable Artificial Intelligence (XAI), nlg evaluation	Unverified
SAGEval: The frontiers of Satisfactory Agent based NLG Evaluation for reference-free open-ended text	Nov 25, 2024	Language Modeling, Language Modelling	Unverified
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation	Mar 29, 2017	nlg evaluation, Survey	Unverified
The Pitfalls of Defining Hallucination	Jan 15, 2024	Hallucination, nlg evaluation	Unverified
The use of rating and Likert scales in Natural Language Generation human evaluation tasks: A review and some recommendations	Oct 1, 2019	nlg evaluation, Text Generation	Unverified
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models	Jul 15, 2023	nlg evaluation, Response Generation	Code Available



No leaderboard results yet.