SOTAVerified

Program Repair

Program repair is the task of teaching ML models to modify an existing program so that a bug in the given code is fixed.
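A minimal sketch of what the task looks like in practice, assuming a hypothetical buggy function `largest`, a hypothetical test suite `TESTS`, and a hand-written list of operator swaps standing in for a learned model. It illustrates the generic generate-and-validate loop (propose a patch, keep it if the tests pass), not the method of any paper listed on this page.

```python
from typing import List, Optional, Tuple

# Hypothetical buggy program: meant to return the largest element,
# but the comparison is inverted, so it returns the smallest.
BUGGY_SOURCE = """
def largest(values):
    best = values[0]
    for v in values[1:]:
        if v < best:
            best = v
    return best
"""

# Hypothetical test cases: (input, expected output).
TESTS: List[Tuple[List[int], int]] = [
    ([1, 2, 3], 3),
    ([5, 4, 9, 2], 9),
    ([7], 7),
]


def passes_tests(source: str) -> bool:
    """Execute a candidate program and check it against every test case."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        func = namespace["largest"]
        return all(func(list(inp)) == expected for inp, expected in TESTS)
    except Exception:
        # A candidate that fails to compile or crashes is not a valid patch.
        return False


def repair(source: str) -> Optional[str]:
    """Try single operator swaps; return the first candidate that passes all tests."""
    swaps = [("<", ">"), (">", "<"), ("<", "<="), ("==", "!=")]
    for old, new in swaps:
        candidate = source.replace(old, new, 1)
        if candidate != source and passes_tests(candidate):
            return candidate
    return None


patched = repair(BUGGY_SOURCE)
```

In a learned system, the fixed swap list would be replaced by a model that proposes candidate edits; the validation step against tests or another correctness signal plays the same role.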

Papers

Showing 1-50 of 132 papers

Title | Status | Hype
Agentless: Demystifying LLM-based Software Engineering Agents | Code | 7
AutoCodeRover: Autonomous Program Improvement | Code | 7
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale | Code | 3
RepairAgent: An Autonomous, LLM-Based Agent for Program Repair | Code | 2
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging | Code | 2
A Syntax-Guided Edit Decoder for Neural Program Repair | Code | 1
RepairBench: Leaderboard of Frontier Models for Program Repair | Code | 1
Global Relational Models of Source Code | Code | 1
Integrating Various Software Artifacts for Better LLM-based Bug Localization and Program Repair | Code | 1
o3-mini vs DeepSeek-R1: Which One is Safer? | Code | 1
CURE: Code-Aware Neural Machine Translation for Automatic Program Repair | Code | 1
Break-It-Fix-It: Unsupervised Learning for Program Repair | Code | 1
CoCoNuT: Combining Context-Aware Neural Translation Models using Ensemble for Program Repair | Code | 1
SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning | Code | 1
How Effective Are Neural Networks for Fixing Security Vulnerabilities | Code | 1
xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval | Code | 1
Unified Pre-training for Program Understanding and Generation | Code | 1
RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair | Code | 1
Enhancing Genetic Improvement Mutations Using Large Language Models | Code | 1
Neural Program Repair by Jointly Learning to Localize and Repair | Code | 1
KNOD: Domain Knowledge Distilled Tree Decoder for Automated Program Repair | Code | 1
Planning-Driven Programming: A Large Language Model Programming Workflow | Code | 1
CoSIL: Software Issue Localization via LLM-Driven Code Repository Graph Searching | Code | 1
Aligning the Objective of LLM-based Program Repair | Code | 1
Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair | Code | 1
TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer | Code | 1
Graph-based, Self-Supervised Program Repair from Diagnostic Feedback | Code | 1
Arachne: Search Based Repair of Deep Neural Networks | Code | 0
Robot Action Selection Learning via Layered Dimension Informed Program Synthesis | Code | 0
SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair | Code | 0
Can GPT-O1 Kill All Bugs? An Evaluation of GPT-Family LLMs on QuixBugs | Code | 0
Assessing the Effectiveness of Syntactic Structure to Learn Code Edit Representations | Code | 0
Breaking the Silence: the Threats of Using LLMs in Software Engineering | Code | 0
RePair: Automated Program Repair with Process-based Feedback | Code | 0
A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models | Code | 0
Teaching Large Language Models to Self-Debug | Code | 0
Out of Context: How important is Local Context in Neural Program Repair? | Code | 0
Dynamic Neural Program Embedding for Program Repair | Code | 0
Patching as Translation: the Data and the Metaphor | Code | 0
Invalidator: Automated Patch Correctness Assessment via Semantic and Syntactic Reasoning | Code | 0
InvAASTCluster: On Applying Invariant-Based Program Clustering to Introductory Programming Assignments | Code | 0
Deep Reinforcement Learning for Programming Language Correction | Code | 0
DeepFix: Fixing Common C Language Errors by Deep Learning | Code | 0
Less is More: Adaptive Program Repair with Bug Localization and Preference Learning | Code | 0
Benchmarking Educational Program Repair | Code | 0
Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks | Code | 0
Graph Neural Networks For Mapping Variables Between Programs -- Extended Version | Code | 0
C-Pack of IPAs: A C90 Program Benchmark of Introductory Programming Assignments | Code | 0
Human-In-The-Loop Automatic Program Repair | Code | 0
Exploring Plausible Patches Using Source Code Embeddings in JavaScript | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | DrRepair + BIFI | Average Success Rate | 71.7 | | Unverified
2 | DrRepair | Average Success Rate | 68.2 | | Unverified
3 | SampleFix | Average Success Rate | 45.3 | | Unverified
4 | RLAssist | Average Success Rate | 26.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Transformer + BIFI | Accuracy (%) | 90.5 | | Unverified
2 | Transformer | Accuracy (%) | 62 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MGDebugger (DeepSeek-Coder-V2-Lite) | Pass@1 | 97.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TFix | Error Removal | 678 | | Unverified