SOTAVerified

Code Completion

Papers

Showing 1–50 of 212 papers

Title | Status | Hype
Beyond Autocomplete: Designing CopilotLens Towards Transparent and Explainable AI Coding Agents | - | 0
Plan for Speed -- Dilated Scheduling for Masked Diffusion Language Models | - | 0
Seed-Coder: Let the Code Model Curate Data for Itself | Code | 4
HiLDe: Intentional Code Generation via Human-in-the-Loop Decoding | - | 0
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development | Code | 2
Alignment-Augmented Speculative Decoding with Alignment Sampling and Conditional Verification | - | 0
Structure-Aware Corpus Construction and User-Perception-Aligned Metrics for Large-Language-Model Code Completion | - | 0
Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective | Code | 0
CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts | - | 0
Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents | - | 0
AKD: Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks | - | 0
NoEsis: Differentially Private Knowledge Transfer in Modular LLM Adaptation | - | 0
EduBot -- Can LLMs Solve Personalized Learning and Programming Assignments? | - | 0
RTLRepoCoder: Repository-Level RTL Code Completion through the Combination of Fine-Tuning and Retrieval Augmentation | - | 0
CCCI: Code Completion with Contextual Information for Complex Data Transfer Tasks Using Large Language Models | - | 0
ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding | Code | 0
LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation | Code | 1
On Explaining (Large) Language Models For Code Using Global Code-Based Explanations | - | 0
Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning | Code | 0
CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection | Code | 1
GenAI for Simulation Model in Model-Based Systems Engineering | - | 0
FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation | - | 0
SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair | - | 0
Alchemist: Towards the Design of Efficient Online Continual Learning System | - | 0
Automated Code Generation and Validation for Software Components of Microcontrollers | - | 0
LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification | Code | 2
Comparative Analysis of Large Language Models for Context-Aware Code Completion using SAFIM Framework | - | 0
Mechanistic Understanding of Language Models in Syntactic Code Completion | - | 0
How to Get Your LLM to Generate Challenging Problems for Evaluation | Code | 1
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task | - | 0
GREEN-CODE: Learning to Optimize Energy Efficiency in LLM-based Code Generation | Code | 0
Improving FIM Code Completions via Context & Curriculum Based Learning | - | 0
ExecRepoBench: Multi-level Executable Code Completion Evaluation | - | 0
ContextModule: Improving Code Completion via Repository-level Contextual Information | - | 0
Protect Your Secrets: Understanding and Measuring Data Exposure in VSCode Extensions | - | 0
LibEvolutionEval: A Benchmark and Study for Version-Specific Code Generation | - | 0
FastDraft: How to Train Your Draft | - | 0
CoCoP: Enhancing Text Classification with LLM through Code Completion Prompt | - | 0
GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models | Code | 1
JudgeRank: Leveraging Large Language Models for Reasoning-Intensive Reranking | - | 0
Can Language Models Replace Programmers for Coding? REPOCOD Says 'Not Yet' | Code | 1
M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation | - | 0
Building A Coding Assistant via the Retrieval-Augmented Language Model | Code | 1
Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework | Code | 9
aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Processing | Code | 7
KV Prediction for Improved Time to First Token | Code | 0
CursorCore: Assist Programming through Aligning Anything | Code | 2
CodeCipher: Learning to Obfuscate Source Code Against LLMs | - | 0
Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning | - | 0
Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? | Code | 2
Page 1 of 5

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | deepseek-coder-33b-base | Average | 69.01 | - | Unverified
2 | deepseek-coder-6.7b-base | Average | 63.4 | - | Unverified
3 | starcoderbase | Average | 55.54 | - | Unverified
4 | gpt-4-1106-preview | Average | 53.28 | - | Unverified
5 | CodeLlama-13b-hf | Average | 52.78 | - | Unverified
6 | deepseek-coder-1.3b-base | Average | 52.63 | - | Unverified
7 | CodeLlama-34b-hf | Average | 49.66 | - | Unverified
8 | CodeLlama-7b-hf | Average | 45 | - | Unverified
9 | gpt-3.5-turbo-0301 | Average | 40.86 | - | Unverified
10 | incoder-6B | Average | 33.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CodeGPT-adapted | Accuracy (token-level) | 77.13 | - | Unverified
2 | CodeT5+ 770M | EM (line-level) | 37.9 | - | Unverified
3 | CodeT5+ 220M | EM (line-level) | 35.17 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CodeGPT-adapted | Accuracy (token-level) | 75.11 | - | Unverified
2 | CodeT5+ 770M | EM (line-level) | 44.86 | - | Unverified
3 | CodeT5+ 220M | EM (line-level) | 43.42 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SantaCoder-MGD | Compilation Rate | 73.03 | - | Unverified
2 | SantaCoder | Compilation Rate | 59.97 | - | Unverified
3 | SantaCoder | Compilation Rate | 59.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Rambo | Compilation Rate | 76.47 | - | Unverified
2 | RepoCoder | Compilation Rate | 74.02 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Rambo | Compilation Rate | 61.7 | - | Unverified
2 | RepoCoder | Compilation Rate | 58.09 | - | Unverified