| CodeMirage: Hallucinations in Code Generated by Large Language Models | Aug 14, 2024 | Code Generation, Hallucination | —Unverified | 0 | 0 |
| CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts | May 8, 2025 | Code Completion, Code Generation | —Unverified | 0 | 0 |
| Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency | Jun 18, 2024 | HumanEval, MBPP | —Unverified | 0 | 0 |
| CodeShell Technical Report | Mar 23, 2024 | 8k, HumanEval | —Unverified | 0 | 0 |
| CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models | Nov 7, 2024 | Code Generation, Decision Making | —Unverified | 0 | 0 |
| Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting | Aug 18, 2024 | HumanEval, Mathematical Reasoning | —Unverified | 0 | 0 |
| Context-Augmented Code Generation Using Programming Knowledge Graphs | Oct 9, 2024 | Code Generation, HumanEval | —Unverified | 0 | 0 |
| CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks | Sep 13, 2024 | ARC, Code Generation | —Unverified | 0 | 0 |
| CREST: Effectively Compacting a Datastore For Retrieval-Based Speculative Decoding | Aug 8, 2024 | HumanEval, Retrieval | —Unverified | 0 | 0 |
| CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution | Aug 23, 2024 | Code Generation, HumanEval | —Unverified | 0 | 0 |