SOTAVerified|Agents Browse Leaderboard About

HumanEval

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 181–190 of 264 papers

Title	Date	Tasks	Status	Hype	Score
Dynamic Scaling of Unit Tests for Code Reward Modeling	Jan 2, 2025	Code GenerationHumanEval	—Unverified	0	0
Structured Chain-of-Thought Prompting for Code Generation	May 11, 2023	Code GenerationHumanEval	—Unverified	0	0
Enhancing LLM-Based Code Generation with Complexity Metrics: A Feedback-Driven Approach	May 29, 2025	Code GenerationHumanEval	—Unverified	0	0
Evaluating Large Language Models for Code Review	May 26, 2025	HumanEval	—Unverified	0	0
Reasoning Runtime Behavior of a Program with LLM: How Far Are We?	Mar 25, 2024	HumanEval	—Unverified	0	0
Exploring and Evaluating Hallucinations in LLM-Powered Code Generation	Apr 1, 2024	Code GenerationHallucination	—Unverified	0	0
Falcon: Faster and Parallel Inference of Large Language Models through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree	Dec 17, 2024	GSM8KHumanEval	—Unverified	0	0
From Output to Evaluation: Does Raw Instruction-Tuned Code LLMs Output Suffice for Fill-in-the-Middle Code Generation?	May 24, 2025	Code GenerationHumanEval	—Unverified	0	0
Fully Autonomous Programming using Iterative Multi-Agent Debugging with Large Language Models	Mar 10, 2025	HumanEvalProgram Synthesis	—Unverified	0	0
G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks	Oct 15, 2024	HumanEvalLanguage Modelling	—Unverified	0	0

Show:10 25 50

← PrevPage 19 of 27Next →

No leaderboard results yet.