Code Generation

Code Generation is the task of predicting explicit code or program structure from multimodal data sources such as incomplete code, programs in another programming language, natural language descriptions, or execution examples. Code Generation tools can support automatic programming and improve programmer productivity.

Source: Deep Learning for Source Code Modeling and Generation
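
As a toy illustration of the "execution examples" input mentioned in the description, the sketch below picks a program that is consistent with a set of input/output pairs. It is a deliberately simple enumerative search over a hand-written candidate list, not a learned model and not a method taken from the cited survey; the names `CANDIDATES` and `synthesize` are hypothetical and exist only for this example.

```python
from typing import Callable, Optional

# Hypothetical candidate programs, stored as (source text, callable) pairs.
# A real system would generate candidates with a model instead of a fixed list.
CANDIDATES: list[tuple[str, Callable[[int], int]]] = [
    ("lambda x: x + 1", lambda x: x + 1),
    ("lambda x: x * 2", lambda x: x * 2),
    ("lambda x: x * x", lambda x: x * x),
]


def synthesize(examples: list[tuple[int, int]]) -> Optional[str]:
    """Return the source of the first candidate that matches every example."""
    for source, fn in CANDIDATES:
        if all(fn(inp) == out for inp, out in examples):
            return source
    return None  # no candidate is consistent with the examples


if __name__ == "__main__":
    # Two execution examples: 2 -> 4 and 3 -> 9.
    print(synthesize([(2, 4), (3, 9)]))  # prints: lambda x: x * x
```

With only the single example `(2, 4)`, the search would return `lambda x: x * 2` instead, which shows why a few examples can leave the intended program ambiguous.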

Image source: Measuring Coding Challenge Competence With APPS

Papers


Showing 1–10 of 1697 papers

Title	Date	Tasks	Status	Hype
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning	Jul 18, 2025	Code Generation, GPU	Unverified	0
Towards Formal Verification of LLM-Generated Code from Natural Language Prompts	Jul 17, 2025	Code Generation	Unverified	0
MERA Code: A Unified Framework for Evaluating Code Generation Across Tasks	Jul 16, 2025	Code Generation	Unverified	0
Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training	Jul 16, 2025	Code Generation, Math	Unverified	0
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs	Jul 15, 2025	Code Generation, Safety Alignment	Code Available	2
Turning the Tide: Repository-based Code Reflection	Jul 14, 2025	Code Generation, Diversity	Unverified	0
CodeAssistBench (CAB): Dataset & Benchmarking for Multi-turn Chat-Based Code Assistance	Jul 14, 2025	Benchmarking, Code Generation	Unverified	0
CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks	Jul 14, 2025	Benchmarking, Code Generation	Unverified	0
Kodezi Chronos: A Debugging-First Language Model for Repository-Scale, Memory-Driven Code Understanding	Jul 14, 2025	Code Generation, Language Modeling	Code Available	9
Multilingual Multimodal Software Developer for Code Generation	Jul 11, 2025	Code Generation, Instruction Following	Unverified	0


Datasets: MBPP, APPS, CoNaLa, Django, WikiSQL, RES-Q, CodeContests, HumanEval, PECC, WebApp1K-React, CoNaLa-Ext, WebApp1k-Duo-React

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	EG-CFG (DeepSeek-V3-0324)	Accuracy	96.6	—	Unverified
2	QualityFlow (Sonnet-3.5)	Accuracy	94.2	—	Unverified
3	o1-mini + MapCoder (Hamming.ai)	Accuracy	93.2	—	Unverified
4	MGDebugger (DeepSeek-V3-0324)	Accuracy	92.4	—	Unverified
5	GPT-4 + AgentCoder	Accuracy	91.8	—	Unverified
6	CodeSim (GPT4o)	Accuracy	90.7	—	Unverified
7	Jiutian-大模型	Accuracy	90	—	Unverified
8	GPT-3.5 Turbo (ChatGPT) + AgentCoder	Accuracy	89.9	—	Unverified
9	MapCoder (GPT-4o)	Accuracy	89.7	—	Unverified
10	GPT-4 (ChatGPT Plus)	Accuracy	87.5	—	Unverified