| Agentless: Demystifying LLM-based Software Engineering Agents | Jul 1, 2024 | Program Repair | Code Available | 7 |
| AutoCodeRover: Autonomous Program Improvement | Apr 8, 2024 | Bug fixing, Code Search | Code Available | 7 |
| HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale | Sep 9, 2024 | Code Generation, Fault localization | Code Available | 3 |
| From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging | Oct 2, 2024 | Auto Debugging, Bug fixing | Code Available | 2 |
| RepairAgent: An Autonomous, LLM-Based Agent for Program Repair | Mar 25, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| CoSIL: Software Issue Localization via LLM-Driven Code Repository Graph Searching | Mar 28, 2025 | Program Repair | Code Available | 1 |
| o3-mini vs DeepSeek-R1: Which One is Safer? | Jan 30, 2025 | Code Generation, Program Repair | Code Available | 1 |
| Integrating Various Software Artifacts for Better LLM-based Bug Localization and Program Repair | Dec 5, 2024 | Fault localization, Program Repair | Code Available | 1 |
| Planning-Driven Programming: A Large Language Model Programming Workflow | Nov 21, 2024 | Code Generation, HumanEval | Code Available | 1 |
| RepairBench: Leaderboard of Frontier Models for Program Repair | Sep 27, 2024 | Program Repair | Code Available | 1 |
| SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning | Jun 3, 2024 | Code Completion, Code Generation | Code Available | 1 |
| Aligning the Objective of LLM-based Program Repair | Apr 13, 2024 | Fault localization, Program Repair | Code Available | 1 |
| RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair | Dec 25, 2023 | HumanEval, parameter-efficient fine-tuning | Code Available | 1 |
| Enhancing Genetic Improvement Mutations Using Large Language Models | Oct 18, 2023 | Program Repair | Code Available | 1 |
| Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair | Sep 1, 2023 | Code Generation, Program Repair | Code Available | 1 |
| How Effective Are Neural Networks for Fixing Security Vulnerabilities | May 29, 2023 | Code Completion, Program Repair | Code Available | 1 |
| xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval | Mar 6, 2023 | Program Repair, Program Synthesis | Code Available | 1 |
| KNOD: Domain Knowledge Distilled Tree Decoder for Automated Program Repair | Feb 3, 2023 | Decoder, Program Repair | Code Available | 1 |
| TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer | Jul 18, 2021 | Code Generation, Multi-Task Learning | Code Available | 1 |
| A Syntax-Guided Edit Decoder for Neural Program Repair | Jun 15, 2021 | Code Completion, Code Generation | Code Available | 1 |
| Break-It-Fix-It: Unsupervised Learning for Program Repair | Jun 11, 2021 | C++ code, Code Repair | Code Available | 1 |
| Unified Pre-training for Program Understanding and Generation | Mar 10, 2021 | Clone Detection, Code Generation | Code Available | 1 |
| CURE: Code-Aware Neural Machine Translation for Automatic Program Repair | Feb 26, 2021 | Machine Translation, NMT | Code Available | 1 |
| CoCoNuT: Combining Context-Aware Neural Translation Models using Ensemble for Program Repair | Jul 18, 2020 | Ensemble Learning, Machine Translation | Code Available | 1 |
| Graph-based, Self-Supervised Program Repair from Diagnostic Feedback | May 20, 2020 | Code Generation, Diagnostic | Code Available | 1 |
| Global Relational Models of Source Code | May 1, 2020 | Inductive Bias, Program Repair | Code Available | 1 |
| Neural Program Repair by Jointly Learning to Localize and Repair | Apr 3, 2019 | Program Repair, Variable misuse | Code Available | 1 |
| CORE: Benchmarking LLMs Code Reasoning Capabilities through Static Analysis Tasks | Jul 3, 2025 | Benchmarking, Code Generation | Unverified | 0 |
| T^3: Multi-level Tree-based Automatic Program Repair with Large Language Models | Jun 26, 2025 | Program Repair | Unverified | 0 |
| Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories | Jun 23, 2025 | Large Language Model, Program Repair | Unverified | 0 |
| Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures of LLM- and Agent-Based Repair Systems | Jun 20, 2025 | Program Repair | Unverified | 0 |
| SemAgent: A Semantics Aware Program Repair Agent | Jun 19, 2025 | Program Repair | Unverified | 0 |
| A Multi-Dataset Evaluation of Models for Automated Vulnerability Repair | Jun 5, 2025 | Program Repair, Vulnerability Detection | Unverified | 0 |
| An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks | May 27, 2025 | Code Generation, Code Summarization | Unverified | 0 |
| Gradient-Based Program Repair: Fixing Bugs in Continuous Program Spaces | May 23, 2025 | Program Repair | Unverified | 0 |
| Synthetic Code Surgery: Repairing Bugs and Vulnerabilities with LLMs and Synthetic Data | May 12, 2025 | Program Repair, Synthetic Data Generation | Unverified | 0 |
| Towards Effectively Leveraging Execution Traces for Program Repair with Code LLMs | May 7, 2025 | Program Repair | Unverified | 0 |
| The Art of Repair: Optimizing Iterative Program Repair with Instruction-Tuned Models | May 5, 2025 | HumanEval, Program Repair | Unverified | 0 |
| SWE-Synth: Synthesizing Verifiable Bug-Fix Data to Enable Large Language Models in Resolving Real-World Bugs | Apr 20, 2025 | Program Repair | Unverified | 0 |
| Using ML filters to help automated vulnerability repairs: when it helps and when it doesn't | Apr 9, 2025 | Program Repair, Vulnerability Detection | Unverified | 0 |
| Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing | Mar 20, 2025 | Fairness, Program Repair | Unverified | 0 |
| Evaluating the Generalizability of LLMs in Automated Program Repair | Mar 12, 2025 | Program Repair, Prompt Engineering | Unverified | 0 |
| Less is More: Adaptive Program Repair with Bug Localization and Preference Learning | Mar 9, 2025 | Bug fixing, Program Repair | Code Available | 0 |
| Where's the Bug? Attention Probing for Scalable Fault Localization | Feb 19, 2025 | Fault localization, Program Repair | Unverified | 0 |
| LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks | Feb 10, 2025 | Code Generation, Program Repair | Unverified | 0 |
| Agentic Bug Reproduction for Effective Automated Program Repair at Google | Feb 3, 2025 | Large Language Model, Program Repair | Unverified | 0 |
| Evaluating Agent-based Program Repair at Google | Jan 13, 2025 | Code Generation, Program Repair | Unverified | 0 |
| The Impact of Input Order Bias on Large Language Models for Software Fault Localization | Dec 25, 2024 | Fault localization, Memorization | Unverified | 0 |
| Counterexample Guided Program Repair Using Zero-Shot Learning and MaxSAT-based Fault Localization | Dec 19, 2024 | Fault localization, Program Repair | Unverified | 0 |
| A Comprehensive Survey of AI-Driven Advancements and Techniques in Automated Program Repair and Code Generation | Nov 12, 2024 | Bug fixing, Code Generation | Unverified | 0 |