SOTAVerified

mbpp

Papers

Showing 51–100 of 129 papers

| Title | Status | Hype |
| --- | --- | --- |
| Context-Augmented Code Generation Using Programming Knowledge Graphs | | 0 |
| RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance | Code | 0 |
| AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation | Code | 0 |
| Selection of Prompt Engineering Techniques for Code Generation through Predicting Code Complexity | | 0 |
| Policy Filtration in RLHF to Fine-Tune LLM for Code Generation | Code | 1 |
| USCD: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding | | 0 |
| Planning In Natural Language Improves LLM Search For Code Generation | Code | 1 |
| Prompt Baking | | 0 |
| Bridging the Language Gap: Enhancing Multilingual Prompt-Based Code Generation in LLMs via Zero-Shot Cross-Lingual Transfer | | 0 |
| CodeMirage: Hallucinations in Code Generated by Large Language Models | | 0 |
| CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases | Code | 7 |
| Discrete Flow Matching | | 0 |
| InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct | Code | 1 |
| Brevity is the soul of wit: Pruning long files for code generation | | 0 |
| Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning | | 0 |
| Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency | | 0 |
| DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling | Code | 1 |
| Evaluating LLM-driven User-Intent Formalization for Verification-Aware Languages | | 0 |
| PLUM: Improving Code LMs with Execution-Guided On-Policy Preference Learning Driven By Synthetic Test Cases | | 0 |
| A Survey on Large Language Models for Code Generation | Code | 2 |
| Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation | | 0 |
| ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation | Code | 1 |
| Uncovering LLM-Generated Code: A Zero-Shot Synthetic Code Detector via Code Rewriting | | 0 |
| EffiLearner: Enhancing Efficiency of Generated Code via Self-Optimization | Code | 1 |
| Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast | Code | 1 |
| Multiple-Choice Questions are Efficient and Robust LLM Evaluators | Code | 1 |
| MHPP: Exploring the Capabilities and Limitations of Language Models Beyond Basic Code Generation | Code | 1 |
| MapCoder: Multi-Agent Code Generation for Competitive Problem Solving | Code | 2 |
| NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts | Code | 2 |
| Better & Faster Large Language Models via Multi-token Prediction | Code | 1 |
| XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts | Code | 1 |
| NExT: Teaching Large Language Models to Reason about Code Execution | | 0 |
| Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective | Code | 0 |
| CYCLE: Learning to Self-Refine the Code Generation | Code | 1 |
| SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents | | 0 |
| Software Vulnerability and Functionality Assessment using LLMs | | 0 |
| LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code | | 0 |
| InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models | Code | 1 |
| Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step | Code | 4 |
| OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement | Code | 5 |
| Test-Driven Development for Code Generation | | 0 |
| DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning | Code | 1 |
| Unsupervised Evaluation of Code LLMs with Round-Trip Correctness | Code | 1 |
| Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision | | 0 |
| Getting the most out of your tokenizer for pre-training and domain adaptation | Code | 1 |
| OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models | Code | 1 |
| PythonSaga: Redefining the Benchmark to Evaluate Code Generating LLMs | | 0 |
| Instruction Fusion: Advancing Prompt Evolution through Hybridization | Code | 0 |
| AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation | Code | 2 |
| ComplexityNet: Increasing LLM Inference Efficiency by Learning Task Complexity | | 0 |

No leaderboard results yet.