| Context-Augmented Code Generation Using Programming Knowledge Graphs | Oct 9, 2024 | Code Generation, HumanEval | Unverified | 0 |
| RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance | Oct 2, 2024 | Code Generation, HumanEval | Code Available | 0 |
| AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation | Oct 1, 2024 | Code Generation, HumanEval | Code Available | 0 |
| Selection of Prompt Engineering Techniques for Code Generation through Predicting Code Complexity | Sep 24, 2024 | Code Generation, Contrastive Learning | Unverified | 0 |
| Policy Filtration in RLHF to Fine-Tune LLM for Code Generation | Sep 11, 2024 | Code Generation, HumanEval | Code Available | 1 |
| USCD: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding | Sep 9, 2024 | Code Generation, HumanEval | Unverified | 0 |
| Planning In Natural Language Improves LLM Search For Code Generation | Sep 5, 2024 | Code Generation, Diversity | Code Available | 1 |
| Prompt Baking | Sep 4, 2024 | ARC, GSM8K | Unverified | 0 |
| Bridging the Language Gap: Enhancing Multilingual Prompt-Based Code Generation in LLMs via Zero-Shot Cross-Lingual Transfer | Aug 19, 2024 | Code Generation, Cross-Lingual Transfer | Unverified | 0 |
| CodeMirage: Hallucinations in Code Generated by Large Language Models | Aug 14, 2024 | Code Generation, Hallucination | Unverified | 0 |
| CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases | Aug 7, 2024 | HumanEval, MBPP | Code Available | 7 |
| Discrete Flow Matching | Jul 22, 2024 | HumanEval, MBPP | Unverified | 0 |
| InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct | Jul 8, 2024 | Code Generation, Code Summarization | Code Available | 1 |
| Brevity is the soul of wit: Pruning long files for code generation | Jun 29, 2024 | Code Generation, HumanEval | Unverified | 0 |
| Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning | Jun 20, 2024 | GSM8K, Heuristic Search | Unverified | 0 |
| Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency | Jun 18, 2024 | HumanEval, MBPP | Unverified | 0 |
| DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling | Jun 17, 2024 | GSM8K, Math | Code Available | 1 |
| Evaluating LLM-driven User-Intent Formalization for Verification-Aware Languages | Jun 14, 2024 | Code Generation, MBPP | Unverified | 0 |
| PLUM: Improving Code LMs with Execution-Guided On-Policy Preference Learning Driven By Synthetic Test Cases | Jun 11, 2024 | Code Generation, HumanEval | Unverified | 0 |
| A Survey on Large Language Models for Code Generation | Jun 1, 2024 | Code Generation, HumanEval | Code Available | 2 |
| Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation | May 30, 2024 | Code Generation, HumanEval | Unverified | 0 |
| ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation | May 27, 2024 | Code Generation, HumanEval | Code Available | 1 |
| Uncovering LLM-Generated Code: A Zero-Shot Synthetic Code Detector via Code Rewriting | May 25, 2024 | Contrastive Learning, MBPP | Unverified | 0 |
| EffiLearner: Enhancing Efficiency of Generated Code via Self-Optimization | May 24, 2024 | Code Generation, HumanEval | Code Available | 1 |
| Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast | May 23, 2024 | Computational Efficiency, GSM8K | Code Available | 1 |
| Multiple-Choice Questions are Efficient and Robust LLM Evaluators | May 20, 2024 | GSM8K, HumanEval | Code Available | 1 |
| MHPP: Exploring the Capabilities and Limitations of Language Models Beyond Basic Code Generation | May 19, 2024 | Code Generation, HumanEval | Code Available | 1 |
| MapCoder: Multi-Agent Code Generation for Competitive Problem Solving | May 18, 2024 | Code Generation, HumanEval | Code Available | 2 |
| NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts | May 7, 2024 | HumanEval, MBPP | Code Available | 2 |
| Better & Faster Large Language Models via Multi-token Prediction | Apr 30, 2024 | HumanEval, MBPP | Code Available | 1 |
| XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts | Apr 23, 2024 | HumanEval, MBPP | Code Available | 1 |
| NExT: Teaching Large Language Models to Reason about Code Execution | Apr 23, 2024 | HumanEval, MBPP | Unverified | 0 |
| Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective | Apr 11, 2024 | Code Generation, HumanEval | Code Available | 0 |
| CYCLE: Learning to Self-Refine the Code Generation | Mar 27, 2024 | Code Generation, HumanEval | Code Available | 1 |
| SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents | Mar 23, 2024 | Code Generation, HumanEval | Unverified | 0 |
| Software Vulnerability and Functionality Assessment using LLMs | Mar 13, 2024 | Code Generation, HumanEval | Unverified | 0 |
| LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code | Mar 12, 2024 | Code Generation, HumanEval | Unverified | 0 |
| InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models | Mar 11, 2024 | Code Generation, HumanEval | Code Available | 1 |
| Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step | Feb 25, 2024 | Code Generation, HumanEval | Code Available | 4 |
| OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement | Feb 22, 2024 | Code Generation, HumanEval | Code Available | 5 |
| Test-Driven Development for Code Generation | Feb 21, 2024 | Code Generation, HumanEval | Unverified | 0 |
| DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning | Feb 14, 2024 | Code Generation, HumanEval | Code Available | 1 |
| Unsupervised Evaluation of Code LLMs with Round-Trip Correctness | Feb 13, 2024 | HumanEval, MBPP | Code Available | 1 |
| Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision | Feb 5, 2024 | GSM8K, Math | Unverified | 0 |
| Getting the most out of your tokenizer for pre-training and domain adaptation | Feb 1, 2024 | Code Generation, Domain Adaptation | Code Available | 1 |
| OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models | Jan 12, 2024 | Code Generation, HumanEval | Code Available | 1 |
| PythonSaga: Redefining the Benchmark to Evaluate Code Generating LLMs | Jan 8, 2024 | Code Generation, Diversity | Unverified | 0 |
| Instruction Fusion: Advancing Prompt Evolution through Hybridization | Dec 25, 2023 | Code Generation, HumanEval | Code Available | 0 |
| AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation | Dec 20, 2023 | Code Generation, HumanEval | Code Available | 2 |
| ComplexityNet: Increasing LLM Inference Efficiency by Learning Task Complexity | Dec 12, 2023 | Code Generation, Language Modeling | Unverified | 0 |