| Discrete Flow Matching | Jul 22, 2024 | HumanEval, MBPP | Unverified | 0 |
| Scaling Granite Code Models to 128K Context | Jul 18, 2024 | 2k, 4k | Code Available | 4 |
| Qwen2 Technical Report | Jul 15, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 13 |
| MaPPing Your Model: Assessing the Impact of Adversarial Attacks on LLM-based Programming Assistants | Jul 12, 2024 | HumanEval | Unverified | 0 |
| InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct | Jul 8, 2024 | Code Generation, Code Summarization | Code Available | 1 |
| Brevity is the soul of wit: Pruning long files for code generation | Jun 29, 2024 | Code Generation, HumanEval | Unverified | 0 |
| Towards Large Language Model Aided Program Refinement | Jun 26, 2024 | HumanEval, Language Modeling | Unverified | 0 |
| RES-Q: Evaluating Code-Editing Large Language Model Systems at the Repository Scale | Jun 24, 2024 | Code Generation, HumanEval | Code Available | 1 |
| Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models | Jun 20, 2024 | Code Generation, HumanEval | Unverified | 0 |
| Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency | Jun 18, 2024 | HumanEval, MBPP | Unverified | 0 |