SOTAVerified

HumanEval

Papers

Showing 41-50 of 264 papers

Title | Status | Hype
MapCoder: Multi-Agent Code Generation for Competitive Problem Solving | Code | 2
any4: Learned 4-bit Numeric Representation for LLMs | Code | 2
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models | Code | 2
MasRouter: Learning to Route LLMs for Multi-Agent Systems | Code | 2
Instruction Tuning With Loss Over Instructions | Code | 1
InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models | Code | 1
InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct | Code | 1
HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization | Code | 1
Invisible Entropy: Towards Safe and Efficient Low-Entropy LLM Watermarking | Code | 1
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation | Code | 1

No leaderboard results yet.