SOTAVerified

Code Completion

Papers

Showing 101–125 of 212 papers

| Title | Status | Hype |
|---|---|---|
| Neural Models for Source Code Synthesis and Completion | — | 0 |
| Do Large Code Models Understand Programming Concepts? Counterfactual Analysis for Code Predicates | — | 0 |
| Enhancing LLM-Based Coding Tools through Native Integration of IDE-Derived Static Context | — | 0 |
| EffiBench: Benchmarking the Efficiency of Automatically Generated Code | Code | 2 |
| Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | Code | 5 |
| StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback | Code | 2 |
| OMPGPT: A Generative Pre-trained Transformer Model for OpenMP | — | 0 |
| Can Large Language Models Write Parallel Code? | Code | 1 |
| LangBridge: Multilingual Reasoning Without Multilingual Supervision | Code | 2 |
| When Neural Code Completion Models Size up the Situation: Attaining Cheaper and Faster Completion through Dynamic Model Inference | Code | 0 |
| Traces of Memorisation in Large Language Models for Code | Code | 0 |
| A Review of Repository Level Prompting for LLMs | — | 0 |
| Breaking the Silence: the Threats of Using LLMs in Software Engineering | Code | 0 |
| INSPECT: Intrinsic and Systematic Probing Evaluation for Code Transformers | Code | 0 |
| Interpretability Illusions in the Generalization of Simplified Models | — | 0 |
| GenCodeSearchNet: A Benchmark Test Suite for Evaluating Generalization in Programming Language Understanding | Code | 0 |
| Past as a Guide: Leveraging Retrospective Learning for Python Code Completion | — | 0 |
| Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications | — | 0 |
| Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation | — | 0 |
| CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion | Code | 1 |
| LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression | Code | 5 |
| Critique Ability of Large Language Models | — | 0 |
| Ada-Instruct: Adapting Instruction Generators for Complex Reasoning | Code | 1 |
| BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models | Code | 1 |
| LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models | Code | 1 |
Page 5 of 9

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | deepseek-coder-33b-base | Average | 69.01 | — | Unverified |
| 2 | deepseek-coder-6.7b-base | Average | 63.4 | — | Unverified |
| 3 | starcoderbase | Average | 55.54 | — | Unverified |
| 4 | gpt-4-1106-preview | Average | 53.28 | — | Unverified |
| 5 | CodeLlama-13b-hf | Average | 52.78 | — | Unverified |
| 6 | deepseek-coder-1.3b-base | Average | 52.63 | — | Unverified |
| 7 | CodeLlama-34b-hf | Average | 49.66 | — | Unverified |
| 8 | CodeLlama-7b-hf | Average | 45 | — | Unverified |
| 9 | gpt-3.5-turbo-0301 | Average | 40.86 | — | Unverified |
| 10 | incoder-6B | Average | 33.79 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CodeGPT-adapted | Accuracy (token-level) | 77.13 | — | Unverified |
| 2 | CodeT5+ 770M | EM (line-level) | 37.9 | — | Unverified |
| 3 | CodeT5+ 220M | EM (line-level) | 35.17 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CodeGPT-adapted | Accuracy (token-level) | 75.11 | — | Unverified |
| 2 | CodeT5+ 770M | EM (line-level) | 44.86 | — | Unverified |
| 3 | CodeT5+ 220M | EM (line-level) | 43.42 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SantaCoder-MGD | Compilation Rate | 73.03 | — | Unverified |
| 2 | SantaCoder | Compilation Rate | 59.97 | — | Unverified |
| 3 | SantaCoder | Compilation Rate | 59.79 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Rambo | Compilation Rate | 76.47 | — | Unverified |
| 2 | RepoCoder | Compilation Rate | 74.02 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Rambo | Compilation Rate | 61.7 | — | Unverified |
| 2 | RepoCoder | Compilation Rate | 58.09 | — | Unverified |