SOTAVerified

Code Completion

Papers

Showing 51–100 of 212 papers

| Title | Status | Hype |
|---|---|---|
| CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context | Code | 1 |
| Productivity Assessment of Neural Code Completion | Code | 1 |
| Energy-Based Models for Code Generation under Compilability Constraints | Code | 1 |
| Empirical Study of Transformers for Source Code | Code | 1 |
| MPI-rical: Data-Driven MPI Distributed Parallelism Assistance with Transformers | Code | 1 |
| RAMBO: Enhancing RAG-based Repository-Level Method Body Completion | Code | 1 |
| CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection | Code | 1 |
| LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations | Code | 1 |
| Long Code Arena: a Set of Benchmarks for Long-Context Code Models | Code | 1 |
| CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion | Code | 1 |
| Can Language Models Replace Programmers for Coding? REPOCOD Says 'Not Yet' | Code | 1 |
| Ada-Instruct: Adapting Instruction Generators for Complex Reasoning | Code | 1 |
| Language Models for Code Completion: A Practical Evaluation | Code | 1 |
| DataSculpt: Crafting Data Landscapes for Long-Context LLMs through Multi-Objective Partitioning | Code | 1 |
| Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion | Code | 1 |
| Can Large Language Models Write Parallel Code? | Code | 1 |
| A Syntax-Guided Edit Decoder for Neural Program Repair | Code | 1 |
| Learning Deep Semantics for Test Completion | Code | 1 |
| LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models | Code | 1 |
| LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation | Code | 1 |
| ReACC: A Retrieval-Augmented Code Completion Framework | Code | 1 |
| How to Get Your LLM to Generate Challenging Problems for Evaluation | Code | 1 |
| Curriculum Learning for Small Code Language Models | | 0 |
| Critique Ability of Large Language Models | | 0 |
| ContextModule: Improving Code Completion via Repository-level Contextual Information | | 0 |
| Context Composing for Full Line Code Completion | | 0 |
| Compilable Neural Code Generation with Compiler Feedback | | 0 |
| A Self-Attentional Neural Architecture for Code Completion with Multi-Task Learning | | 0 |
| AKD : Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks | | 0 |
| SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair | | 0 |
| Comparative Analysis of Large Language Models for Context-Aware Code Completion using SAFIM Framework | | 0 |
| Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing | | 0 |
| Beyond Autocomplete: Designing CopilotLens Towards Transparent and Explainable AI Coding Agents | | 0 |
| Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion | | 0 |
| A Review of Repository Level Prompting for LLMs | | 0 |
| CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts | | 0 |
| Jailbreak Attacks and Defenses Against Large Language Models: A Survey | | 0 |
| Benchmarking Causal Study to Interpret Large Language Models for Source Code | | 0 |
| HiLDe: Intentional Code Generation via Human-in-the-Loop Decoding | | 0 |
| Insights from the Usage of the Ansible Lightspeed Code Completion Service | | 0 |
| JudgeRank: Leveraging Large Language Models for Reasoning-Intensive Reranking | | 0 |
| KV Prediction for Improved Time to First Token | | 0 |
| Automated Code Generation and Validation for Software Components of Microcontrollers | | 0 |
| TaskEval: Assessing Difficulty of Code Generation Tasks for Large Language Models | | 0 |
| GraphCodeBERT: Pre-training Code Representations with Data Flow | | 0 |
| CodeGemma: Open Code Models Based on Gemma | | 0 |
| Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning | | 0 |
| Interpretability Illusions in the Generalization of Simplified Models | | 0 |
| Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions | | 0 |
| All You Need Is Logs: Improving Code Completion by Learning from Anonymous IDE Usage Logs | | 0 |

Benchmark Results

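| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | deepseek-coder-33b-base | Average | 69.01 | | Unverified |
| 2 | deepseek-coder-6.7b-base | Average | 63.4 | | Unverified |
| 3 | starcoderbase | Average | 55.54 | | Unverified |
| 4 | gpt-4-1106-preview | Average | 53.28 | | Unverified |
| 5 | CodeLlama-13b-hf | Average | 52.78 | | Unverified |
| 6 | deepseek-coder-1.3b-base | Average | 52.63 | | Unverified |
| 7 | CodeLlama-34b-hf | Average | 49.66 | | Unverified |
| 8 | CodeLlama-7b-hf | Average | 45 | | Unverified |
| 9 | gpt-3.5-turbo-0301 | Average | 40.86 | | Unverified |
| 10 | incoder-6B | Average | 33.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CodeGPT-adapted | Accuracy (token-level) | 77.13 | | Unverified |
| 2 | CodeT5+ 770M | EM (line-level) | 37.9 | | Unverified |
| 3 | CodeT5+ 220M | EM (line-level) | 35.17 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CodeGPT-adapted | Accuracy (token-level) | 75.11 | | Unverified |
| 2 | CodeT5+ 770M | EM (line-level) | 44.86 | | Unverified |
| 3 | CodeT5+ 220M | EM (line-level) | 43.42 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SantaCoder-MGD | Compilation Rate | 73.03 | | Unverified |
| 2 | SantaCoder | Compilation Rate | 59.97 | | Unverified |
| 3 | SantaCoder | Compilation Rate | 59.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Rambo | Compilation Rate | 76.47 | | Unverified |
| 2 | RepoCoder | Compilation Rate | 74.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Rambo | Compilation Rate | 61.7 | | Unverified |
| 2 | RepoCoder | Compilation Rate | 58.09 | | Unverified |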