SOTAVerified

HumanEval

Papers

Showing 226-250 of 264 papers

Title | Status | Hype
Type-Constrained Code Generation with Language Models | | 0
UnitCoder: Scalable Iterative Code Synthesis with Unit Test Guidance | | 0
Validating LLM-Generated Programs with Metamorphic Prompt Testing | | 0
VALTEST: Automated Validation of Language Model Generated Test Cases | | 0
SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents | | 0
Large Language Models Meet NL2Code: A Survey | | 0
A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models | Code | 0
Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding | Code | 0
Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation | Code | 0
JavaBench: A Benchmark of Object-Oriented Code Generation for Evaluating Large Language Models | Code | 0
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks | Code | 0
Multi-Programming Language Ensemble for Code Generation in Large Language Model | Code | 0
mHumanEval -- A Multilingual Benchmark to Evaluate Large Language Models for Code Generation | Code | 0
Large Language Models of Code Fail at Completing Code with Potential Bugs | Code | 0
Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models | Code | 0
Investigating the Performance of Language Models for Completing Code in Functional Programming Languages: a Haskell Case Study | Code | 0
Measuring the Influence of Incorrect Code on Test Generation | Code | 0
InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation | Code | 0
CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality | Code | 0
Instruction Fusion: Advancing Prompt Evolution through Hybridization | Code | 0
RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance | Code | 0
Inference Scaling fLaws: The Limits of LLM Resampling with Imperfect Verifiers | Code | 0
ThrowBench: Benchmarking LLMs by Predicting Runtime Exceptions | Code | 0
HumanEval on Latest GPT Models -- 2024 | Code | 0
CodeT5+: Open Code Large Language Models for Code Understanding and Generation | Code | 0

No leaderboard results yet.