SOTAVerified

HumanEval

Papers

Showing 76–100 of 264 papers

| Title | Status | Hype |
| --- | --- | --- |
| Addressing Data Leakage in HumanEval Using Combinatorial Test Design | | 0 |
| Inference Scaling fLaws: The Limits of LLM Resampling with Imperfect Verifiers | Code | 0 |
| A Preliminary Study of Multilingual Code Language Models for Code Generation Task Using Translated Benchmarks | | 0 |
| Planning-Driven Programming: A Large Language Model Programming Workflow | Code | 1 |
| DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs | | 0 |
| PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback | Code | 1 |
| VALTEST: Automated Validation of Language Model Generated Test Cases | | 0 |
| Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models | | 0 |
| CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models | | 0 |
| InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation | Code | 0 |
| SelfCodeAlign: Self-Alignment for Code Generation | Code | 3 |
| Demo-Craft: Using In-Context Learning to Improve Code Generation in Large Language Models | | 0 |
| Can Language Models Replace Programmers for Coding? REPOCOD Says 'Not Yet' | Code | 1 |
| FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding Optimization system | Code | 0 |
| Aligning CodeLLMs with Direct Preference Optimization | | 0 |
| Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment | | 0 |
| MojoBench: Language Modeling and Benchmarks for Mojo | | 0 |
| Scattered Forest Search: Smarter Code Space Exploration with LLMs | | 0 |
| Self-Evolving Multi-Agent Collaboration Networks for Software Development | | 0 |
| Semantic-guided Search for Efficient Program Repair with Large Language Models | | 0 |
| Self-Explained Keywords Empower Large Language Models for Code Generation | | 0 |
| mHumanEval -- A Multilingual Benchmark to Evaluate Large Language Models for Code Generation | Code | 0 |
| CELI: Controller-Embedded Language Model Interactions | | 0 |
| HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks | Code | 1 |
| G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks | | 0 |

No leaderboard results yet.