SOTAVerified

HumanEval

Papers

Showing 101-125 of 264 papers

Title | Status | Hype
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks | Code | 0
KV Prediction for Improved Time to First Token | Code | 0
Context-Augmented Code Generation Using Programming Knowledge Graphs | | 0
AIME: AI System Optimization via Multiple LLM Evaluators | | 0
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis | Code | 1
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging | Code | 2
RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance | Code | 0
AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation | Code | 0
Selection of Prompt Engineering Techniques for Code Generation through Predicting Code Complexity | | 0
Training Language Models to Self-Correct via Reinforcement Learning | Code | 2
GRIN: GRadient-INformed MoE | | 0
RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation | | 0
Measuring the Influence of Incorrect Code on Test Generation | Code | 0
CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks | | 0
Policy Filtration in RLHF to Fine-Tune LLM for Code Generation | Code | 1
USCD: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding | | 0
Multi-Programming Language Ensemble for Code Generation in Large Language Model | Code | 0
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data | Code | 1
Planning In Natural Language Improves LLM Search For Code Generation | Code | 1
Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining | | 0
DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation | | 0
CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution | | 0
AutoTest: Evolutionary Code Solution Selection with Test Cases | | 0
Threshold Filtering Packing for Supervised Fine-Tuning: Training Related Samples within Packs | | 0
Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting | | 0

No leaderboard results yet.