SOTAVerified|Agents Browse Leaderboard About

HumanEval

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 151–175 of 264 papers

Title	Date	Tasks	Status	Hype	Score
AutoTest: Evolutionary Code Solution Selection with Test Cases	Aug 22, 2024	Code GenerationHumanEval	—Unverified	0	0
BASS: Batched Attention-optimized Speculative Sampling	Apr 24, 2024	GPUHumanEval	—Unverified	0	0
Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol	Mar 7, 2025	BenchmarkingBug fixing	—Unverified	0	0
PythonSaga: Redefining the Benchmark to Evaluate Code Generating LLMs	Jan 8, 2024	Code GenerationDiversity	—Unverified	0	0
Brevity is the soul of wit: Pruning long files for code generation	Jun 29, 2024	Code GenerationHumanEval	—Unverified	0	0
Bridging Code Semantic and LLMs: Semantic Chain-of-Thought Prompting for Code Generation	Oct 16, 2023	Code GenerationHumanEval	—Unverified	0	0
Can LLMs Enable Verification in Mainstream Programming?	Mar 18, 2025	Code GenerationHumanEval	—Unverified	0	0
CELI: Controller-Embedded Language Model Interactions	Oct 18, 2024	ArticlesCode Generation	—Unverified	0	0
CodeCoT: Tackling Code Syntax Errors in CoT Reasoning for Code Generation	Aug 17, 2023	Code GenerationFew-Shot Learning	—Unverified	0	0
CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model	Oct 10, 2023	Code GenerationCode Translation	—Unverified	0	0
CodeMirage: Hallucinations in Code Generated by Large Language Models	Aug 14, 2024	Code GenerationHallucination	—Unverified	0	0
CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts	May 8, 2025	Code CompletionCode Generation	—Unverified	0	0
Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency	Jun 18, 2024	HumanEvalmbpp	—Unverified	0	0
CodeShell Technical Report	Mar 23, 2024	8kHumanEval	—Unverified	0	0
CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models	Nov 7, 2024	Code GenerationDecision Making	—Unverified	0	0
Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting	Aug 18, 2024	HumanEvalMathematical Reasoning	—Unverified	0	0
Context-Augmented Code Generation Using Programming Knowledge Graphs	Oct 9, 2024	Code GenerationHumanEval	—Unverified	0	0
CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks	Sep 13, 2024	ARCCode Generation	—Unverified	0	0
CREST: Effectively Compacting a Datastore For Retrieval-Based Speculative Decoding	Aug 8, 2024	HumanEvalRetrieval	—Unverified	0	0
CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution	Aug 23, 2024	Code GenerationHumanEval	—Unverified	0	0
Dafny as Verification-Aware Intermediate Language for Code Generation	Jan 10, 2025	Code GenerationHumanEval	—Unverified	0	0
Decoding Data Quality via Synthetic Corruptions: Embedding-guided Pruning of Code Data	Dec 5, 2023	Code GenerationHumanEval	—Unverified	0	0
Demo-Craft: Using In-Context Learning to Improve Code Generation in Large Language Models	Oct 30, 2024	Code GenerationHumanEval	—Unverified	0	0
Discrete Flow Matching	Jul 22, 2024	HumanEvalmbpp	—Unverified	0	0
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation	May 30, 2024	Code GenerationHumanEval	—Unverified	0	0

Show:10 25 50

← PrevPage 7 of 11Next →

No leaderboard results yet.