SOTAVerified

HumanEval

Papers

Showing 176–200 of 264 papers

| Title | Status | Hype |
|---|---|---|
| Does Few-Shot Learning Help LLM Performance in Code Synthesis? | | 0 |
| Does your data spark joy? Performance gains from domain upsampling at the end of training | | 0 |
| DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation | | 0 |
| Dovetail: A CPU/GPU Heterogeneous Speculative Decoding for LLM inference | | 0 |
| DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs | | 0 |
| Dynamic Scaling of Unit Tests for Code Reward Modeling | | 0 |
| Structured Chain-of-Thought Prompting for Code Generation | | 0 |
| Enhancing LLM-Based Code Generation with Complexity Metrics: A Feedback-Driven Approach | | 0 |
| Evaluating Large Language Models for Code Review | | 0 |
| Reasoning Runtime Behavior of a Program with LLM: How Far Are We? | | 0 |
| Exploring and Evaluating Hallucinations in LLM-Powered Code Generation | | 0 |
| Falcon: Faster and Parallel Inference of Large Language Models through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree | | 0 |
| From Output to Evaluation: Does Raw Instruction-Tuned Code LLMs Output Suffice for Fill-in-the-Middle Code Generation? | | 0 |
| Fully Autonomous Programming using Iterative Multi-Agent Debugging with Large Language Models | | 0 |
| G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks | | 0 |
| Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs? | | 0 |
| GRIN: GRadient-INformed MoE | | 0 |
| Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees | | 0 |
| Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks | | 0 |
| Guideline Forest: Experience-Induced Multi-Guideline Reasoning with Stepwise Aggregation | | 0 |
| Importing Phantoms: Measuring LLM Package Hallucination Vulnerabilities | | 0 |
| Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models | | 0 |
| InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion | | 0 |
| Interactive Code Generation via Test-Driven User-Intent Formalization | | 0 |
| Isolating Language-Coding from Problem-Solving: Benchmarking LLMs with PseudoEval | | 0 |

No leaderboard results yet.