SOTAVerified

HumanEval

Papers

Showing 81-90 of 264 papers

Title | Status | Hype
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data | Code | 1
LeTI: Learning to Generate from Textual Interactions | Code | 1
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion | Code | 1
Getting the most out of your tokenizer for pre-training and domain adaptation | Code | 1
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration | Code | 1
ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation | Code | 1
How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark | Code | 1
Fault-Aware Neural Code Rankers | Code | 1
Better & Faster Large Language Models via Multi-token Prediction | Code | 1
Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models | Code | 1

No leaderboard results yet.