SOTAVerified

HumanEval

Papers

Showing 111–120 of 264 papers

Title | Status | Hype
An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks | | 0
Kotlin ML Pack: Technical Report | | 0
Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks | | 0
Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees | | 0
GRIN: GRadient-INformed MoE | | 0
Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs? | | 0
CodeShell Technical Report | | 0
Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency | | 0
G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks | | 0
CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts | | 0
Page 12 of 27

No leaderboard results yet.