| CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks | Jul 14, 2025 | Benchmarking, Code Generation | Unverified | 0 |
| Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving | Jul 8, 2025 | Code Repair, Transfer Learning | Code Available | 3 |
| Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents | May 30, 2025 | Benchmarking, Code Repair | Unverified | 0 |
| CrashFixer: A crash resolution agent for the Linux kernel | Apr 29, 2025 | Code Repair | Unverified | 0 |
| How Accurately Do Large Language Models Understand Code? | Apr 6, 2025 | Code Generation, Code Repair | Unverified | 0 |
| Why Stop at One Error? Benchmarking LLMs as Data Science Code Debuggers for Multi-Hop and Multi-Bug Errors | Mar 28, 2025 | Benchmarking, Code Generation | Code Available | 0 |
| RocketPPA: Code-Level Power, Performance, and Area Prediction via LLM and Mixture of Experts | Mar 27, 2025 | Code Repair, Feature Engineering | Unverified | 0 |
| SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair | Mar 3, 2025 | Code Completion, Code Repair | Unverified | 0 |
| AuPair: Golden Example Pairs for Code Repair | Feb 12, 2025 | Code Repair, In-Context Learning | Unverified | 0 |
| Fortran2CPP: Automating Fortran-to-C++ Translation using LLMs via Multi-Turn Dialogue and Dual-Agent Integration | Dec 27, 2024 | C++ code, Code Repair | Code Available | 1 |
| LLM-Aided Efficient Hardware Design Automation | Oct 24, 2024 | Code Repair, Logical Reasoning | Unverified | 0 |
| CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair | Sep 19, 2024 | Code Generation, Code Repair | Code Available | 0 |
| Enhancing Source Code Security with LLMs: Demystifying The Challenges and Generating Reliable Repairs | Sep 1, 2024 | Code Repair | Unverified | 0 |
| COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis | Aug 9, 2024 | Code Generation, Code Repair | Code Available | 1 |
| Is Your AI-Generated Code Really Safe? Evaluating Large Language Models on Secure Code Generation with CodeSecEval | Jul 2, 2024 | Code Generation, Code Repair | Unverified | 0 |
| Investigating the Transferability of Code Repair for Low-Resource Programming Languages | Jun 21, 2024 | Code Generation, Code Repair | Unverified | 0 |
| SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents | Jun 18, 2024 | Code Generation, Code Repair | Code Available | 2 |
| Code Repair with LLMs gives an Exploration-Exploitation Tradeoff | May 26, 2024 | Code Repair, Language Modeling | Unverified | 0 |
| AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct | May 23, 2024 | Class-level Code Generation, Code Completion | Code Available | 4 |
| DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models | Feb 19, 2024 | Code Repair, Few-Shot Learning | Unverified | 0 |
| Code Security Vulnerability Repair Using Reinforcement Learning with Large Language Models | Jan 13, 2024 | Code Generation, Code Repair | Unverified | 0 |
| Enhanced Automated Code Vulnerability Repair using Large Language Models | Jan 8, 2024 | C++ code, Code Repair | Unverified | 0 |
| INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair | Nov 16, 2023 | Code Generation, Code Repair | Code Available | 1 |
| Enhancing Large Language Models for Secure Code Generation: A Dataset-driven Study on Vulnerability Mitigation | Oct 25, 2023 | Code Generation, Code Repair | Unverified | 0 |
| OctoPack: Instruction Tuning Code Large Language Models | Aug 14, 2023 | Code Generation, Code Repair | Code Available | 3 |