SOTAVerified

HumanEval

Papers

Showing 201-250 of 264 papers

Title | Status | Hype
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models | - | 0
RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation | - | 0
SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization | - | 0
Scattered Forest Search: Smarter Code Space Exploration with LLMs | - | 0
SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity | - | 0
Selection of Prompt Engineering Techniques for Code Generation through Predicting Code Complexity | - | 0
SelfEvolve: A Code Evolution Framework via Large Language Models | - | 0
Self-Evolving Multi-Agent Collaboration Networks for Software Development | - | 0
Self-Explained Keywords Empower Large Language Models for Code Generation | - | 0
Semantic-guided Search for Efficient Program Repair with Large Language Models | - | 0
TaskEval: Assessing Difficulty of Code Generation Tasks for Large Language Models | - | 0
SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths | - | 0
Stochastic Code Generation | - | 0
Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency | - | 0
SwiftEval: Developing a Language-Specific Benchmark for LLM-generated Code Evaluation | - | 0
Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models | - | 0
Test-Driven Development for Code Generation | - | 0
Textbooks Are All You Need | - | 0
The Art of Repair: Optimizing Iterative Program Repair with Instruction-Tuned Models | - | 0
The Program Testing Ability of Large Language Models for Code | - | 0
The Stack: 3 TB of permissively licensed source code | - | 0
Thinking Before Running! Efficient Code Generation with Thorough Exploration and Optimal Refinement | - | 0
Threshold Filtering Packing for Supervised Fine-Tuning: Training Related Samples within Packs | - | 0
Towards Large Language Model Aided Program Refinement | - | 0
Turning the Tide: Repository-based Code Reflection | - | 0
Type-Constrained Code Generation with Language Models | - | 0
UnitCoder: Scalable Iterative Code Synthesis with Unit Test Guidance | - | 0
Validating LLM-Generated Programs with Metamorphic Prompt Testing | - | 0
VALTEST: Automated Validation of Language Model Generated Test Cases | - | 0
SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents | - | 0
Large Language Models Meet NL2Code: A Survey | - | 0
A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models | Code | 0
Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding | Code | 0
Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation | Code | 0
JavaBench: A Benchmark of Object-Oriented Code Generation for Evaluating Large Language Models | Code | 0
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks | Code | 0
Multi-Programming Language Ensemble for Code Generation in Large Language Model | Code | 0
mHumanEval -- A Multilingual Benchmark to Evaluate Large Language Models for Code Generation | Code | 0
Large Language Models of Code Fail at Completing Code with Potential Bugs | Code | 0
Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models | Code | 0
Investigating the Performance of Language Models for Completing Code in Functional Programming Languages: a Haskell Case Study | Code | 0
Measuring the Influence of Incorrect Code on Test Generation | Code | 0
InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation | Code | 0
CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality | Code | 0
Instruction Fusion: Advancing Prompt Evolution through Hybridization | Code | 0
RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance | Code | 0
Inference Scaling fLaws: The Limits of LLM Resampling with Imperfect Verifiers | Code | 0
ThrowBench: Benchmarking LLMs by Predicting Runtime Exceptions | Code | 0
HumanEval on Latest GPT Models -- 2024 | Code | 0
CodeT5+: Open Code Large Language Models for Code Understanding and Generation | Code | 0
Page 5 of 6

No leaderboard results yet.