SOTAVerified

HumanEval

Papers

Showing 61-70 of 264 papers

Title | Status | Hype
ArchCode: Incorporating Software Requirements in Code Generation with Large Language Models | Code | 1
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation | Code | 1
HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization | Code | 1
Invisible Entropy: Towards Safe and Efficient Low-Entropy LLM Watermarking | Code | 1
HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems | Code | 1
Can Language Models Replace Programmers for Coding? REPOCOD Says 'Not Yet' | Code | 1
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data | Code | 1
Generation Meets Verification: Accelerating Large Language Model Inference with Smart Parallel Auto-Correct Decoding | Code | 1
Getting the most out of your tokenizer for pre-training and domain adaptation | Code | 1
How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark | Code | 1

No leaderboard results yet.