SOTAVerified|Agents Browse Leaderboard About

HumanEval

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 211–220 of 264 papers

Title	Date	Tasks	Status	Hype	Score
LORD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot Compression	Sep 25, 2023	Code GenerationHumanEval	—Unverified	0	0
Low-Cost Language Models: Survey and Performance Evaluation on Python Code Generation	Apr 17, 2024	Code GenerationHumanEval	—Unverified	0	0
MaPPing Your Model: Assessing the Impact of Adversarial Attacks on LLM-based Programming Assistants	Jul 12, 2024	HumanEval	—Unverified	0	0
USCD: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding	Sep 9, 2024	Code GenerationHumanEval	—Unverified	0	0
Memorization or Interpolation ? Detecting LLM Memorization through Input Perturbation Analysis	May 5, 2025	ArticlesHumanEval	—Unverified	0	0
MojoBench: Language Modeling and Benchmarks for Mojo	Oct 23, 2024	Code GenerationHumanEval	—Unverified	0	0
Mutation-based Consistency Testing for Evaluating the Code Understanding Capability of LLMs	Jan 11, 2024	Code GenerationHumanEval	—Unverified	0	0
NExT: Teaching Large Language Models to Reason about Code Execution	Apr 23, 2024	HumanEvalmbpp	—Unverified	0	0
NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness	Jan 29, 2024	HumanEval	—Unverified	0	0
On the Limitations of Embedding Based Methods for Measuring Functional Correctness for Code Generation	Apr 26, 2024	Code GenerationHumanEval	—Unverified	0	0

Show:10 25 50

← PrevPage 22 of 27Next →

No leaderboard results yet.