| CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion | Oct 17, 2023 | Code Completion, HumanEval | Code Available | 1 |
| CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules | Oct 13, 2023 | Code Generation, HumanEval | Code Available | 1 |
| A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration | Oct 3, 2023 | Arithmetic Reasoning, Code Generation | Code Available | 1 |
| ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation | Aug 3, 2023 | Class-level Code Generation, Code Generation | Code Available | 1 |
| Predicting Code Coverage without Execution | Jul 25, 2023 | HumanEval | Code Available | 1 |
| Is Self-Repair a Silver Bullet for Code Generation? | Jun 16, 2023 | Code Generation, HumanEval | Code Available | 1 |
| ANPL: Towards Natural Programming with Interactive Decomposition | May 29, 2023 | ARC, Code Generation | Code Available | 1 |
| LeTI: Learning to Generate from Textual Interactions | May 17, 2023 | Code Generation, Event Argument Extraction | Code Available | 1 |
| ReCode: Robustness Evaluation of Code Generation Models | Dec 20, 2022 | Code Generation, HumanEval | Code Available | 1 |
| Multi-lingual Evaluation of Code Generation Models | Oct 26, 2022 | Code Completion, Code Generation | Code Available | 1 |