SOTAVerified

Code Repair

Papers

Showing 1-39 of 39 papers

Title | Status | Hype
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct | Code | 4
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving | Code | 3
OctoPack: Instruction Tuning Code Large Language Models | Code | 3
Guiding Language Models of Code with Global Context using Monitors | Code | 2
SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents | Code | 2
Learning Performance-Improving Code Edits | Code | 1
Break-It-Fix-It: Unsupervised Learning for Program Repair | Code | 1
COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis | Code | 1
MACER: A Modular Framework for Accelerated Compilation Error Repair | Code | 1
INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair | Code | 1
Fortran2CPP: Automating Fortran-to-C++ Translation using LLMs via Multi-Turn Dialogue and Dual-Agent Integration | Code | 1
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation | Code | 1
Enhanced Automated Code Vulnerability Repair using Large Language Models | - | 0
Enhancing Large Language Models for Secure Code Generation: A Dataset-driven Study on Vulnerability Mitigation | - | 0
Enhancing Source Code Security with LLMs: Demystifying The Challenges and Generating Reliable Repairs | - | 0
Fix Bugs with Transformer through a Neural-Symbolic Edit Grammar | - | 0
Fix Bugs with Transformer through a Neural-Symbolic Edit Grammar | - | 0
How Accurately Do Large Language Models Understand Code? | - | 0
Inferring Javascript types using Graph Neural Networks | - | 0
Is Your AI-Generated Code Really Safe? Evaluating Large Language Models on Secure Code Generation with CodeSecEval | - | 0
Jointly Learning to Repair Code and Generate Commit Message | - | 0
Learning to Repair Software Vulnerabilities with Generative Adversarial Networks | - | 0
LLM-Aided Efficient Hardware Design Automation | - | 0
Poison Attack and Defense on Deep Source Code Processing Models | - | 0
Semantic Code Repair using Neuro-Symbolic Transformation Networks | - | 0
RocketPPA: Code-Level Power, Performance, and Area Prediction via LLM and Mixture of Experts | - | 0
SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair | - | 0
AuPair: Golden Example Pairs for Code Repair | - | 0
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categories and Test Code Repair | - | 0
Break it - Message it - Fix it : Learning to Repair Python Programs using Error Messages without Labelled Data | - | 0
Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents | - | 0
CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks | - | 0
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff | - | 0
Code Security Vulnerability Repair Using Reinforcement Learning with Large Language Models | - | 0
CrashFixer: A crash resolution agent for the Linux kernel | - | 0
DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models | - | 0
Investigating the Transferability of Code Repair for Low-Resource Programming Languages | - | 0
Why Stop at One Error? Benchmarking LLMs as Data Science Code Debuggers for Multi-Hop and Multi-Bug Errors | Code | 0
CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NSEdit | Accuracy (medium) | 13.87 | - | Unverified
2 | CodeBERT | Accuracy (medium) | 5.2 | - | Unverified