SOTAVerified

Program Repair

Program repair is the task of teaching ML models to modify an existing program so that a bug in the given code is fixed.

Papers

Showing 1-50 of 132 papers

| Title | Status | Hype |
| --- | --- | --- |
| AutoCodeRover: Autonomous Program Improvement | Code | 7 |
| Agentless: Demystifying LLM-based Software Engineering Agents | Code | 7 |
| HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale | Code | 3 |
| From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging | Code | 2 |
| RepairAgent: An Autonomous, LLM-Based Agent for Program Repair | Code | 2 |
| CoCoNuT: Combining Context-Aware Neural Translation Models using Ensemble for Program Repair | Code | 1 |
| o3-mini vs DeepSeek-R1: Which One is Safer? | Code | 1 |
| CURE: Code-Aware Neural Machine Translation for Automatic Program Repair | Code | 1 |
| TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer | Code | 1 |
| Global Relational Models of Source Code | Code | 1 |
| Neural Program Repair by Jointly Learning to Localize and Repair | Code | 1 |
| CoSIL: Software Issue Localization via LLM-Driven Code Repository Graph Searching | Code | 1 |
| RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair | Code | 1 |
| A Syntax-Guided Edit Decoder for Neural Program Repair | Code | 1 |
| Integrating Various Software Artifacts for Better LLM-based Bug Localization and Program Repair | Code | 1 |
| KNOD: Domain Knowledge Distilled Tree Decoder for Automated Program Repair | Code | 1 |
| How Effective Are Neural Networks for Fixing Security Vulnerabilities | Code | 1 |
| Planning-Driven Programming: A Large Language Model Programming Workflow | Code | 1 |
| SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning | Code | 1 |
| xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval | Code | 1 |
| Enhancing Genetic Improvement Mutations Using Large Language Models | Code | 1 |
| Unified Pre-training for Program Understanding and Generation | Code | 1 |
| Aligning the Objective of LLM-based Program Repair | Code | 1 |
| Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair | Code | 1 |
| Break-It-Fix-It: Unsupervised Learning for Program Repair | Code | 1 |
| Graph-based, Self-Supervised Program Repair from Diagnostic Feedback | Code | 1 |
| RepairBench: Leaderboard of Frontier Models for Program Repair | Code | 1 |
| Conversational Automated Program Repair |  | 0 |
| ConDefects: A New Dataset to Address the Data Leakage Concern for LLM-based Fault Localization and Program Repair |  | 0 |
| Collu-Bench: A Benchmark for Predicting Language Model Hallucinations in Code |  | 0 |
| A Study of Vulnerability Repair in JavaScript Programs with Large Language Models |  | 0 |
| To Err is Machine: Vulnerability Detection Challenges LLM Reasoning |  | 0 |
| Fully Autonomous Programming with Large Language Models |  | 0 |
| Agentic Bug Reproduction for Effective Automated Program Repair at Google |  | 0 |
| Enhancing Automated Program Repair with Solution Design |  | 0 |
| BigIssue: A Realistic Bug Localization Benchmark |  | 0 |
| RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair |  | 0 |
| Generating Bug-Fixes Using Pretrained Transformers |  | 0 |
| An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks |  | 0 |
| Automatic Programming: Large Language Models and Beyond |  | 0 |
| AdaptivePaste: Code Adaptation through Learning Semantics-aware Variable Usage Representations |  | 0 |
| Fairness-guided SMT-based Rectification of Decision Trees and Random Forests |  | 0 |
| Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures of LLM- and Agent-Based Repair Systems |  | 0 |
| Dynamic Neural Program Embeddings for Program Repair |  | 0 |
| Enabling Automatic Repair of Source Code Vulnerabilities Using Data-Driven Methods |  | 0 |
| ENCORE: Ensemble Learning using Convolution Neural Machine Translation for Automatic Program Repair |  | 0 |
| Detect-Localize-Repair: A Unified Framework for Learning to Debug with CodeT5 |  | 0 |
| Automated Program Repair: Emerging trends pose and expose problems for benchmarks |  | 0 |
| Evaluating Agent-based Program Repair at Google |  | 0 |
| An Exploratory Literature Study on Sharing and Energy Use of Language Models for Source Code |  | 0 |

Benchmark Results

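| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DrRepair + BIFI | Average Success Rate | 71.7 |  | Unverified |
| 2 | DrRepair | Average Success Rate | 68.2 |  | Unverified |
| 3 | SampleFix | Average Success Rate | 45.3 |  | Unverified |
| 4 | RLAssist | Average Success Rate | 26.6 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Transformer + BIFI | Accuracy (%) | 90.5 |  | Unverified |
| 2 | Transformer | Accuracy (%) | 62 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MGDebugger (DeepSeek-Coder-V2-Lite) | Pass@1 | 97.6 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TFix | Error Removal | 678 |  | Unverified |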