| Leveraging Metamemory Mechanisms for Enhanced Data-Free Code Generation in LLMs | Jan 14, 2025 | Code Generation, HumanEval | — Unverified | 0 |
| Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks | Jan 11, 2025 | Code Generation, HumanEval | — Unverified | 0 |
| Dafny as Verification-Aware Intermediate Language for Code Generation | Jan 10, 2025 | Code Generation, HumanEval | — Unverified | 0 |
| InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion | Jan 6, 2025 | GSM8K, HumanEval | — Unverified | 0 |
| Dynamic Scaling of Unit Tests for Code Reward Modeling | Jan 2, 2025 | Code Generation, HumanEval | — Unverified | 0 |
| Thinking Before Running! Efficient Code Generation with Thorough Exploration and Optimal Refinement | Dec 30, 2024 | Code Generation, HumanEval | — Unverified | 0 |
| HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation | Dec 30, 2024 | Code Generation, HumanEval | Code Available | 1 |
| SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity | Dec 30, 2024 | Benchmarking, Code Generation | — Unverified | 0 |
| Dovetail: A CPU/GPU Heterogeneous Speculative Decoding for LLM inference | Dec 25, 2024 | CPU, GPU | — Unverified | 0 |
| Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models | Dec 18, 2024 | HumanEval, Imitation Learning | — Unverified | 0 |