SOTAVerified

Bug fixing

Papers

Showing 1–25 of 62 papers

Title | Status | Hype
CoreCodeBench: A Configurable Multi-Scenario Repository-Level Benchmark | Code | 1
The Foundation Cracks: A Comprehensive Study on Bugs and Testing Practices in LLM Libraries | - | 0
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development | Code | 2
LongCodeBench: Evaluating Coding LLMs at 1M Context Windows | - | 0
APE-Bench I: Towards File-level Automated Proof Engineering of Formal Math Libraries | - | 0
VeriDebug: A Unified LLM for Verilog Debugging via Contrastive Embedding and Guided Correction | - | 0
On Simulation-Guided LLM-based Code Generation for Safe Autonomous Driving Software | - | 0
Less is More: Adaptive Program Repair with Bug Localization and Preference Learning | Code | 0
Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol | - | 0
Empirical evaluation of LLMs in predicting fixes of Configuration bugs in Smart Home System | - | 0
Repository-level Code Search with Neural Retrieval Methods | Code | 0
GREEN-CODE: Learning to Optimize Energy Efficiency in LLM-based Code Generation | Code | 0
CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking | Code | 2
An Empirical Study on LLM-based Agents for Automated Bug Fixing | - | 0
A Comprehensive Survey of AI-Driven Advancements and Techniques in Automated Program Repair and Code Generation | - | 0
PDC & DM-SFT: A Road for LLM SQL Bug-Fix Enhancing | - | 0
MetRex: A Benchmark for Verilog Code Metric Reasoning Using LLMs | Code | 1
Characterising Open Source Co-opetition in Company-hosted Open Source Software Projects: The Cases of PyTorch, TensorFlow, and Transformers | - | 0
Debug Smarter, Not Harder: AI Agents for Error Resolution in Computational Notebooks | - | 0
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging | Code | 2
MarsCode Agent: AI-native Automated Bug Fixing | - | 0
Leveraging Large Language Models for Enhancing the Understandability of Generated Unit Tests | Code | 1
Patched RTC: evaluating LLMs for diverse software development tasks | Code | 0
CodeR: Issue Resolving with Multi-Agent and Task Graphs | Code | 2
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | Code | 11

No leaderboard results yet.