SOTAVerified

Program Repair

Program repair is the task of teaching ML models to modify an existing program so that a bug in the given code is fixed.

Papers

Showing 51-100 of 132 papers

| Title | Status | Hype |
| --- | --- | --- |
| MdEval: Massively Multilingual Code Debugging | | 0 |
| Semantic-guided Search for Efficient Program Repair with Large Language Models | | 0 |
| Collu-Bench: A Benchmark for Predicting Language Model Hallucinations in Code | | 0 |
| In-Context Code-Text Learning for Bimodal Software Engineering | | 0 |
| Exploring the Potential of Conversational Test Suite Based Program Repair on SWE-bench | | 0 |
| Can GPT-O1 Kill All Bugs? An Evaluation of GPT-Family LLMs on QuixBugs | Code | 0 |
| Enhancing Automated Program Repair with Solution Design | | 0 |
| RePair: Automated Program Repair with Process-based Feedback | Code | 0 |
| MergeRepair: An Exploratory Study on Merging Task-Specific Adapters in Code LLMs for Automated Program Repair | | 0 |
| SpecRover: Code Intent Extraction via LLMs | | 0 |
| Automated C/C++ Program Repair for High-Level Synthesis via Large Language Models | | 0 |
| NARRepair: Non-Autoregressive Code Generation Model for Automatic Program Repair | | 0 |
| Automated Program Repair: Emerging trends pose and expose problems for benchmarks | | 0 |
| Benchmarking Educational Program Repair | Code | 0 |
| Automatic Programming: Large Language Models and Beyond | | 0 |
| NExT: Teaching Large Language Models to Reason about Code Execution | | 0 |
| Peer-aided Repairer: Empowering Large Language Models to Repair Advanced Student Assignments | | 0 |
| To Err is Machine: Vulnerability Detection Challenges LLM Reasoning | | 0 |
| A Study of Vulnerability Repair in JavaScript Programs with Large Language Models | | 0 |
| Towards Reliable Evaluation of Neural Program Repair with Natural Robustness Testing | Code | 0 |
| DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models | | 0 |
| A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models | Code | 0 |
| Breaking the Silence: the Threats of Using LLMs in Software Engineering | Code | 0 |
| Out of Context: How important is Local Context in Neural Program Repair? | Code | 0 |
| Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning | | 0 |
| ConDefects: A New Dataset to Address the Data Leakage Concern for LLM-based Fault Localization and Program Repair | | 0 |
| Automated Bug Generation in the era of Large Language Models | | 0 |
| Program Repair with Minimal Edits Using CodeT5 | | 0 |
| Frustrated with Code Quality Issues? LLMs can Help! | | 0 |
| RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair | | 0 |
| Graph Neural Networks For Mapping Variables Between Programs -- Extended Version | Code | 0 |
| An Exploratory Literature Study on Sharing and Energy Use of Language Models for Source Code | | 0 |
| Better patching using LLM prompting, via Self-Consistency | | 0 |
| Is ChatGPT the Ultimate Programming Assistant -- How far is it? | | 0 |
| Fully Autonomous Programming with Large Language Models | | 0 |
| Enhancing Automated Program Repair through Fine-tuning and Prompt Engineering | | 0 |
| Teaching Large Language Models to Self-Debug | Code | 0 |
| RunBugRun -- An Executable Dataset for Automated Program Repair | | 0 |
| Keep the Conversation Going: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT | | 0 |
| Revisiting the Plastic Surgery Hypothesis via Large Language Models | | 0 |
| Conversational Automated Program Repair | | 0 |
| Invalidator: Automated Patch Correctness Assessment via Semantic and Syntactic Reasoning | Code | 0 |
| Improving Automated Program Repair with Domain Adaptation | | 0 |
| Detect-Localize-Repair: A Unified Framework for Learning to Debug with CodeT5 | | 0 |
| Repairing Bugs in Python Assignments Using Large Language Models | | 0 |
| Repair Is Nearly Generation: Multilingual Program Repair with LLMs | | 0 |
| BigIssue: A Realistic Bug Localization Benchmark | | 0 |
| InvAASTCluster: On Applying Invariant-Based Program Clustering to Introductory Programming Assignments | Code | 0 |
| C-Pack of IPAs: A C90 Program Benchmark of Introductory Programming Assignments | Code | 0 |
| Leveraging Causal Inference for Explainable Automatic Program Repair | | 0 |

Benchmark Results

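| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DrRepair + BIFI | Average Success Rate | 71.7 | | Unverified |
| 2 | DrRepair | Average Success Rate | 68.2 | | Unverified |
| 3 | SampleFix | Average Success Rate | 45.3 | | Unverified |
| 4 | RLAssist | Average Success Rate | 26.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Transformer + BIFI | Accuracy (%) | 90.5 | | Unverified |
| 2 | Transformer | Accuracy (%) | 62 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MGDebugger (DeepSeek-Coder-V2-Lite) | Pass@1 | 97.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TFix | Error Removal | 678 | | Unverified |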