
NLG evaluation

Evaluate the text generated by natural language generation (NLG) systems, such as large language models.

Papers


Showing 31–40 of 71 papers

Title	Date	Tasks	Status	Hype
The Pitfalls of Defining Hallucination	Jan 15, 2024	Hallucination, nlg evaluation	Unverified	0
CoAScore: Chain-of-Aspects Prompting for NLG Evaluation	Dec 16, 2023	nlg evaluation, Response Generation	Unverified	0
X-Eval: Generalizable Multi-aspect Text Evaluation via Augmented Instruction Tuning with Auxiliary Evaluation Aspects	Nov 15, 2023	Dialogue Generation, Language Modelling	Unverified	0
Towards Multiple References Era -- Addressing Data Leakage and Limited Reference Diversity in NLG Evaluation	Aug 6, 2023	Diversity, nlg evaluation	Code Available	0
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models	Jul 15, 2023	nlg evaluation, Response Generation	Code Available	0
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering	Jul 13, 2023	Dialogue Generation, nlg evaluation	Code Available	0
Rethinking Model Evaluation as Narrowing the Socio-Technical Gap	Jun 1, 2023	Explainable Artificial Intelligence (XAI), nlg evaluation	Unverified	0
Dolphin: A Challenging and Diverse Benchmark for Arabic NLG	May 24, 2023	Dialogue Generation, Diversity	Unverified	0
Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References	May 24, 2023	All, Machine Translation	Code Available	0
Describe me an Aucklet: Generating Grounded Perceptual Category Descriptions	Mar 7, 2023	nlg evaluation, Representation Learning	Code Available	0

Page 4 of 8

No leaderboard results yet.