nlg evaluation

Evaluate the text generated by natural language generation (NLG) systems, such as large language models.

Papers

Showing 11–20 of 71 papers

Title	Date	Tasks	Status	Hype
Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation	Sep 14, 2021	nlg evaluation, Style Transfer	Code Available	1
Long-Form Information Alignment Evaluation Beyond Atomic Facts	May 21, 2025	Form, nlg evaluation	Code Available	0
Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts	Apr 29, 2025	All, Diversity	Unverified	0
DeepSeek vs. o3-mini: How Well can Reasoning LLMs Evaluate MT and Summarization?	Apr 10, 2025	Machine Translation, nlg evaluation	Unverified	0
OpeNLGauge: An Explainable Metric for NLG Evaluation with Open-Weights LLMs	Mar 14, 2025	nlg evaluation	Code Available	0
Exploring the Multilingual NLG Evaluation Abilities of LLM-Based Evaluators	Mar 6, 2025	nlg evaluation	Unverified	0
SAGEval: The frontiers of Satisfactory Agent based NLG Evaluation for reference-free open-ended text	Nov 25, 2024	Language Modeling, Language Modelling	Unverified	0
Ev2R: Evaluating Evidence Retrieval in Automated Fact-Checking	Nov 8, 2024	Fact Checking, nlg evaluation	Unverified	0
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation	Oct 22, 2024	nlg evaluation	Code Available	0
Large Language Models Are Active Critics in NLG Evaluation	Oct 14, 2024	nlg evaluation, Prompt Engineering	Unverified	0

No leaderboard results yet.