SOTAVerified

HumanEval

Papers

Showing 126-150 of 264 papers

| Title | Status | Hype |
| --- | --- | --- |
| InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation | Code | 0 |
| ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation | Code | 0 |
| mHumanEval -- A Multilingual Benchmark to Evaluate Large Language Models for Code Generation | Code | 0 |
| Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation | Code | 0 |
| Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding | Code | 0 |
| AgentGroupChat-V2: Divide-and-Conquer Is What LLM-Based Multi-Agent System Need | Code | 0 |
| One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks | Code | 0 |
| Inference Scaling fLaws: The Limits of LLM Resampling with Imperfect Verifiers | Code | 0 |
| JavaBench: A Benchmark of Object-Oriented Code Generation for Evaluating Large Language Models | Code | 0 |
| CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality | Code | 0 |
| Software Vulnerability and Functionality Assessment using LLMs | | 0 |
| ACECODER: Acing Coder RL via Automated Test-Case Synthesis | | 0 |
| Actor-Critic based Online Data Mixing For Language Model Pre-Training | | 0 |
| Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment | | 0 |
| Addressing Data Leakage in HumanEval Using Combinatorial Test Design | | 0 |
| AIME: AI System Optimization via Multiple LLM Evaluators | | 0 |
| Aligning CodeLLMs with Direct Preference Optimization | | 0 |
| AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement | | 0 |
| An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks | | 0 |
| A Preliminary Study of Multilingual Code Language Models for Code Generation Task Using Translated Benchmarks | | 0 |
| ARCS: Agentic Retrieval-Augmented Code Synthesis with Iterative Refinement | | 0 |
| Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining | | 0 |
| A Review of Repository Level Prompting for LLMs | | 0 |
| CodingTeachLLM: Empowering LLM's Coding Ability via AST Prior Knowledge | | 0 |
| AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection | | 0 |

No leaderboard results yet.