SOTAVerified|Agents Browse Leaderboard About

HumanEval

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 101–110 of 264 papers

Title	Date	Tasks	Status	Hype
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks	Oct 14, 2024	FairnessGSM8K	CodeCode Available	0
KV Prediction for Improved Time to First Token	Oct 10, 2024	Code CompletionCPU	—Unverified	0
Context-Augmented Code Generation Using Programming Knowledge Graphs	Oct 9, 2024	Code GenerationHumanEval	—Unverified	0
AIME: AI System Optimization via Multiple LLM Evaluators	Oct 4, 2024	Code GenerationHumanEval	—Unverified	0
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis	Oct 3, 2024	HumanEvalSynthetic Data Generation	CodeCode Available	1
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging	Oct 2, 2024	Auto DebuggingBug fixing	CodeCode Available	2
RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance	Oct 2, 2024	Code GenerationHumanEval	CodeCode Available	0
AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation	Oct 1, 2024	Code GenerationHumanEval	CodeCode Available	0
Selection of Prompt Engineering Techniques for Code Generation through Predicting Code Complexity	Sep 24, 2024	Code GenerationContrastive Learning	—Unverified	0
Training Language Models to Self-Correct via Reinforcement Learning	Sep 19, 2024	HumanEvalMath	CodeCode Available	2

Show:10 25 50

← PrevPage 11 of 27Next →

No leaderboard results yet.