| CoreCodeBench: A Configurable Multi-Scenario Repository-Level Benchmark | Jul 4, 2025 | Bug fixing, Code Generation | Code Available | 1 |
| The Foundation Cracks: A Comprehensive Study on Bugs and Testing Practices in LLM Libraries | Jun 14, 2025 | Bug fixing, Inference Optimization | Unverified | 0 |
| SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development | May 22, 2025 | Bug fixing, Chatbot | Code Available | 2 |
| LongCodeBench: Evaluating Coding LLMs at 1M Context Windows | May 12, 2025 | Bug fixing | Unverified | 0 |
| APE-Bench I: Towards File-level Automated Proof Engineering of Formal Math Libraries | Apr 27, 2025 | Automated Theorem Proving, Bug fixing | Unverified | 0 |
| VeriDebug: A Unified LLM for Verilog Debugging via Contrastive Embedding and Guided Correction | Apr 27, 2025 | Bug fixing | Unverified | 0 |
| On Simulation-Guided LLM-based Code Generation for Safe Autonomous Driving Software | Apr 2, 2025 | Autonomous Driving, Bug fixing | Unverified | 0 |
| Less is More: Adaptive Program Repair with Bug Localization and Preference Learning | Mar 9, 2025 | Bug fixing, Program Repair | Code Available | 0 |
| Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol | Mar 7, 2025 | Benchmarking, Bug fixing | Unverified | 0 |
| Empirical evaluation of LLMs in predicting fixes of Configuration bugs in Smart Home System | Feb 16, 2025 | Bug fixing | Unverified | 0 |
| Repository-level Code Search with Neural Retrieval Methods | Feb 10, 2025 | Bug fixing, Code Search | Code Available | 0 |
| GREEN-CODE: Learning to Optimize Energy Efficiency in LLM-based Code Generation | Jan 19, 2025 | Bug fixing, Code Completion | Code Available | 0 |
| CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking | Dec 1, 2024 | Bug fixing, Code Generation | Code Available | 2 |
| An Empirical Study on LLM-based Agents for Automated Bug Fixing | Nov 15, 2024 | Bug fixing, Fault localization | Unverified | 0 |
| A Comprehensive Survey of AI-Driven Advancements and Techniques in Automated Program Repair and Code Generation | Nov 12, 2024 | Bug fixing, Code Generation | Unverified | 0 |
| PDC & DM-SFT: A Road for LLM SQL Bug-Fix Enhancing | Nov 11, 2024 | Bug fixing, Code Generation | Unverified | 0 |
| MetRex: A Benchmark for Verilog Code Metric Reasoning Using LLMs | Nov 5, 2024 | Bug fixing, Code Generation | Code Available | 1 |
| Characterising Open Source Co-opetition in Company-hosted Open Source Software Projects: The Cases of PyTorch, TensorFlow, and Transformers | Oct 23, 2024 | Bug fixing | Unverified | 0 |
| Debug Smarter, Not Harder: AI Agents for Error Resolution in Computational Notebooks | Oct 18, 2024 | AI Agent, Bug fixing | Unverified | 0 |
| From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging | Oct 2, 2024 | Auto Debugging, Bug fixing | Code Available | 2 |
| MarsCode Agent: AI-native Automated Bug Fixing | Sep 2, 2024 | Bug fixing, Code Completion | Unverified | 0 |
| Leveraging Large Language Models for Enhancing the Understandability of Generated Unit Tests | Aug 21, 2024 | Bug fixing, Descriptive | Code Available | 1 |
| Patched RTC: evaluating LLMs for diverse software development tasks | Jul 23, 2024 | Bug fixing, Model Selection | Code Available | 0 |
| CodeR: Issue Resolving with Multi-Agent and Task Graphs | Jun 3, 2024 | Bug fixing | Code Available | 2 |
| SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | May 6, 2024 | Bug fixing, Language Modeling | Code Available | 11 |