SOTAVerified

HumanEval

Papers

Showing 126–150 of 264 papers

| Title | Status | Hype |
| --- | --- | --- |
| CodeMirage: Hallucinations in Code Generated by Large Language Models | | 0 |
| CREST: Effectively Compacting a Datastore For Retrieval-Based Speculative Decoding | | 0 |
| CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases | Code | 7 |
| ArchCode: Incorporating Software Requirements in Code Generation with Large Language Models | Code | 1 |
| TaskEval: Assessing Difficulty of Code Generation Tasks for Large Language Models | | 0 |
| Discrete Flow Matching | | 0 |
| Scaling Granite Code Models to 128K Context | Code | 4 |
| Qwen2 Technical Report | Code | 13 |
| MaPPing Your Model: Assessing the Impact of Adversarial Attacks on LLM-based Programming Assistants | | 0 |
| InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct | Code | 1 |
| Brevity is the soul of wit: Pruning long files for code generation | | 0 |
| Towards Large Language Model Aided Program Refinement | | 0 |
| RES-Q: Evaluating Code-Editing Large Language Model Systems at the Repository Scale | Code | 1 |
| Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models | | 0 |
| Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency | | 0 |
| ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools | Code | 14 |
| ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation | Code | 0 |
| Reactor Mk.1 performances: MMLU, HumanEval and BBH test results | | 0 |
| PLUM: Improving Code LMs with Execution-Guided On-Policy Preference Learning Driven By Synthetic Test Cases | | 0 |
| Validating LLM-Generated Programs with Metamorphic Prompt Testing | | 0 |
| JavaBench: A Benchmark of Object-Oriented Code Generation for Evaluating Large Language Models | Code | 0 |
| How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark | Code | 1 |
| Does your data spark joy? Performance gains from domain upsampling at the end of training | | 0 |
| SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning | Code | 1 |
| Automatic Instruction Evolving for Large Language Models | Code | 3 |

No leaderboard results yet.