SOTAVerified

Code Completion

Papers

Showing 1-10 of 212 papers

Title | Status | Hype
Beyond Autocomplete: Designing CopilotLens Towards Transparent and Explainable AI Coding Agents | - | 0
Plan for Speed -- Dilated Scheduling for Masked Diffusion Language Models | - | 0
Seed-Coder: Let the Code Model Curate Data for Itself | Code | 4
HiLDe: Intentional Code Generation via Human-in-the-Loop Decoding | - | 0
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development | Code | 2
Alignment-Augmented Speculative Decoding with Alignment Sampling and Conditional Verification | - | 0
Structure-Aware Corpus Construction and User-Perception-Aligned Metrics for Large-Language-Model Code Completion | - | 0
Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective | Code | 0
CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts | - | 0
Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | deepseek-coder-33b-base | Average | 69.01 | - | Unverified
2 | deepseek-coder-6.7b-base | Average | 63.4 | - | Unverified
3 | starcoderbase | Average | 55.54 | - | Unverified
4 | gpt-4-1106-preview | Average | 53.28 | - | Unverified
5 | CodeLlama-13b-hf | Average | 52.78 | - | Unverified
6 | deepseek-coder-1.3b-base | Average | 52.63 | - | Unverified
7 | CodeLlama-34b-hf | Average | 49.66 | - | Unverified
8 | CodeLlama-7b-hf | Average | 45 | - | Unverified
9 | gpt-3.5-turbo-0301 | Average | 40.86 | - | Unverified
10 | incoder-6B | Average | 33.79 | - | Unverified