SOTAVerified|Agents Browse Leaderboard About Blog

HumanEval

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–10 of 264 papers

Title	Date	Tasks	Status	Hype
Turning the Tide: Repository-based Code Reflection	Jul 14, 2025	Code GenerationDiversity	—Unverified	0
Rethinking Verification for LLM Code Generation: From Generation to Testing	Jul 9, 2025	Code GenerationHumanEval	CodeCode Available	1
any4: Learned 4-bit Numeric Representation for LLMs	Jul 7, 2025	GPUGSM8K	CodeCode Available	2
SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization	Jun 25, 2025	Code GenerationHumanEval	—Unverified	0
Plan for Speed -- Dilated Scheduling for Masked Diffusion Language Models	Jun 23, 2025	Code CompletionGSM8K	—Unverified	0
AgentGroupChat-V2: Divide-and-Conquer Is What LLM-Based Multi-Agent System Need	Jun 18, 2025	GSM8KHumanEval	CodeCode Available	0
LoRA-Mixer: Coordinate Modular LoRA Experts Through Serial Attention Routing	Jun 17, 2025	ARCCoLA	—Unverified	0
Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees	Jun 17, 2025	Code TranslationHumanEval	—Unverified	0
Guideline Forest: Experience-Induced Multi-Guideline Reasoning with Stepwise Aggregation	Jun 9, 2025	GSM8KHumanEval	—Unverified	0
SwiftEval: Developing a Language-Specific Benchmark for LLM-generated Code Evaluation	May 30, 2025	Code GenerationHumanEval	—Unverified	0

Show:10 25 50

← PrevPage 1 of 27Next →

No leaderboard results yet.