Code Repair

Papers

Showing 31–39 of 39 papers

Title	Date	Tasks	Status	Hype
Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents	May 30, 2025	Benchmarking, Code Repair	Unverified	0
CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks	Jul 14, 2025	Benchmarking, Code Generation	Unverified	0
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff	May 26, 2024	Code Repair, Language Modeling	Unverified	0
Code Security Vulnerability Repair Using Reinforcement Learning with Large Language Models	Jan 13, 2024	Code Generation, Code Repair	Unverified	0
CrashFixer: A crash resolution agent for the Linux kernel	Apr 29, 2025	Code Repair	Unverified	0
DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models	Feb 19, 2024	Code Repair, Few-Shot Learning	Unverified	0
Investigating the Transferability of Code Repair for Low-Resource Programming Languages	Jun 21, 2024	Code Generation, Code Repair	Unverified	0
Why Stop at One Error? Benchmarking LLMs as Data Science Code Debuggers for Multi-Hop and Multi-Bug Errors	Mar 28, 2025	Benchmarking, Code Generation	Code Available	0
CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair	Sep 19, 2024	Code Generation, Code Repair	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	NSEdit	Accuracy (medium)	13.87	—	Unverified
2	CodeBERT	Accuracy (medium)	5.2	—	Unverified