mbpp

Papers

Showing 11–20 of 129 papers

Title	Date	Tasks	Status	Hype
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models	May 15, 2025	Code Generation, GSM8K	Unverified	0
Rethinking Repetition Problems of LLMs in Code Generation	May 15, 2025	Code Generation, HumanEval	Code Available	1
Web-Bench: A LLM Code Benchmark Based on Web Standards and Frameworks	May 12, 2025	Code Generation	Code Available	3
CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts	May 8, 2025	Code Completion, Code Generation	Unverified	0
DataDecide: How to Predict Best Pretraining Data with Small Experiments	Apr 15, 2025	ARC, HellaSwag	Code Available	3
Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning	Apr 14, 2025	Mathematical Reasoning, mbpp	Code Available	2
Type-Constrained Code Generation with Language Models	Apr 12, 2025	Code Generation, HumanEval	Unverified	0
OpenCodeInstruct: A Large-scale Instruction Tuning Dataset for Code LLMs	Apr 5, 2025	Code Generation, HumanEval	Unverified	0
DynaCode: A Dynamic Complexity-Aware Code Benchmark for Evaluating Large Language Models in Code Generation	Mar 13, 2025	Code Generation, mbpp	Unverified	0
Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs?	Mar 7, 2025	Code Generation, HumanEval	Unverified	0

No leaderboard results yet.