SOTAVerified

HumanEval

Papers

Showing 11-20 of 264 papers

| Title | Status | Hype |
| --- | --- | --- |
| WizardCoder: Empowering Code Large Language Models with Evol-Instruct | Code | 5 |
| StarCoder: may the source be with you! | Code | 5 |
| CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X | Code | 5 |
| Scaling Granite Code Models to 128K Context | Code | 4 |
| Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step | Code | 4 |
| CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution | Code | 4 |
| Magicoder: Empowering Code Generation with OSS-Instruct | Code | 4 |
| Baichuan 2: Open Large-scale Language Models | Code | 4 |
| Reflexion: Language Agents with Verbal Reinforcement Learning | Code | 4 |
| Web-Bench: A LLM Code Benchmark Based on Web Standards and Frameworks | Code | 3 |

No leaderboard results yet.