SOTAVerified

HumanEval

Papers

Showing 101–150 of 264 papers

| Title | Status | Hype |
| --- | --- | --- |
| Invisible Entropy: Towards Safe and Efficient Low-Entropy LLM Watermarking | Code | 1 |
| ContraCLM: Contrastive Learning For Causal Language Model | Code | 1 |
| AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation | Code | 0 |
| Can Programming Languages Boost Each Other via Instruction Tuning? | Code | 0 |
| Instruction Fusion: Advancing Prompt Evolution through Hybridization | Code | 0 |
| HumanEval on Latest GPT Models -- 2024 | Code | 0 |
| Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings | Code | 0 |
| Can Github issues be solved with Tree Of Thoughts? | Code | 0 |
| FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding Optimization system | Code | 0 |
| Investigating the Performance of Language Models for Completing Code in Functional Programming Languages: a Haskell Case Study | Code | 0 |
| RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance | Code | 0 |
| Measuring the Influence of Incorrect Code on Test Generation | Code | 0 |
| Large Language Models of Code Fail at Completing Code with Potential Bugs | Code | 0 |
| Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models | Code | 0 |
| Using Large Language Models to Generate JUnit Tests: An Empirical Study | Code | 0 |
| ThrowBench: Benchmarking LLMs by Predicting Runtime Exceptions | Code | 0 |
| Evaluating How Fine-tuning on Bimodal Data Effects Code Generation | Code | 0 |
| A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models | Code | 0 |
| Self-Correcting Code Generation Using Small Language Models | Code | 0 |
| Self-Edit: Fault-Aware Code Editor for Code Generation | Code | 0 |
| Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency | Code | 0 |
| One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks | Code | 0 |
| Multi-Programming Language Ensemble for Code Generation in Large Language Model | Code | 0 |
| JavaBench: A Benchmark of Object-Oriented Code Generation for Evaluating Large Language Models | Code | 0 |
| mHumanEval -- A Multilingual Benchmark to Evaluate Large Language Models for Code Generation | Code | 0 |
| Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective | Code | 0 |
| ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation | Code | 0 |
| Inference Scaling fLaws: The Limits of LLM Resampling with Imperfect Verifiers | Code | 0 |
| CodeT5+: Open Code Large Language Models for Code Understanding and Generation | Code | 0 |
| Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding | Code | 0 |
| Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation | Code | 0 |
| AgentGroupChat-V2: Divide-and-Conquer Is What LLM-Based Multi-Agent System Need | Code | 0 |
| CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality | Code | 0 |
| CoCoNUT: Structural Code Understanding does not fall out of a tree | Code | 0 |
| InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation | Code | 0 |
| KV Prediction for Improved Time to First Token | Code | 0 |
| Software Vulnerability and Functionality Assessment using LLMs | | 0 |
| ACECODER: Acing Coder RL via Automated Test-Case Synthesis | | 0 |
| Actor-Critic based Online Data Mixing For Language Model Pre-Training | | 0 |
| Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment | | 0 |
| Addressing Data Leakage in HumanEval Using Combinatorial Test Design | | 0 |
| AIME: AI System Optimization via Multiple LLM Evaluators | | 0 |
| Aligning CodeLLMs with Direct Preference Optimization | | 0 |
| AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement | | 0 |
| An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks | | 0 |
| A Preliminary Study of Multilingual Code Language Models for Code Generation Task Using Translated Benchmarks | | 0 |
| ARCS: Agentic Retrieval-Augmented Code Synthesis with Iterative Refinement | | 0 |
| Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining | | 0 |
| A Review of Repository Level Prompting for LLMs | | 0 |
| CodingTeachLLM: Empowering LLM's Coding Ability via AST Prior Knowledge | | 0 |

No leaderboard results yet.