HumanEval

Papers

Showing 91–100 of 264 papers

Title	Date	Tasks	Status	Hype
Better & Faster Large Language Models via Multi-token Prediction	Apr 30, 2024	HumanEval, MBPP	Code Available	1
Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models	Feb 24, 2024	HumanEval, Memorization	Code Available	1
How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark	Jun 10, 2024	HumanEval, Program Synthesis	Code Available	1
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models	Feb 23, 2025	Code Generation, HumanEval	Code Available	1
ContraCLM: Contrastive Learning For Causal Language Model	Oct 3, 2022	Code Generation, Code Search	Code Available	1
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation	May 27, 2024	Code Generation, HumanEval	Code Available	1
ANPL: Towards Natural Programming with Interactive Decomposition	May 29, 2023	ARC, Code Generation	Code Available	1
RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair	Dec 25, 2023	HumanEval, Parameter-Efficient Fine-Tuning	Code Available	1
How to Select Datapoints for Efficient Human Evaluation of NLG Models?	Jan 30, 2025	HumanEval, Machine Translation	Code Available	1
Instruction Tuning With Loss Over Instructions	May 23, 2024	HumanEval, MMLU	Code Available	1

No leaderboard results yet.