HumanEval

Papers

Showing 141–150 of 264 papers

Title	Date	Tasks	Status	Hype	Score
AIME: AI System Optimization via Multiple LLM Evaluators	Oct 4, 2024	Code Generation, HumanEval	Unverified	0	0
Aligning CodeLLMs with Direct Preference Optimization	Oct 24, 2024	Decision Making, HumanEval	Unverified	0	0
AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement	Dec 9, 2024	Code Generation, HumanEval	Unverified	0	0
An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks	May 27, 2025	Code Generation, Code Summarization	Unverified	0	0
A Preliminary Study of Multilingual Code Language Models for Code Generation Task Using Translated Benchmarks	Nov 23, 2024	Code Generation, HumanEval	Unverified	0	0
ARCS: Agentic Retrieval-Augmented Code Synthesis with Iterative Refinement	Apr 29, 2025	Code Generation, HumanEval	Unverified	0	0
Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining	Sep 3, 2024	Code Generation, HumanEval	Unverified	0	0
A Review of Repository Level Prompting for LLMs	Dec 15, 2023	Code Completion, Code Generation	Unverified	0	0
CodingTeachLLM: Empowering LLM's Coding Ability via AST Prior Knowledge	Mar 13, 2024	Dialogue Evaluation, HumanEval	Unverified	0	0
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection	May 12, 2025	GSM8K, HumanEval	Unverified	0	0

No leaderboard results yet.