SOTAVerified

Code Completion

Papers

Showing 51-75 of 212 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learning Deep Semantics for Test Completion | Code | 1 |
| Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks | Code | 1 |
| LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations | Code | 1 |
| Long Code Arena: a Set of Benchmarks for Long-Context Code Models | Code | 1 |
| LambdaNet: Probabilistic Type Inference using Graph Neural Networks | Code | 1 |
| A Syntax-Guided Edit Decoder for Neural Program Repair | Code | 1 |
| How to Get Your LLM to Generate Challenging Problems for Evaluation | Code | 1 |
| Ada-Instruct: Adapting Instruction Generators for Complex Reasoning | Code | 1 |
| Building A Coding Assistant via the Retrieval-Augmented Language Model | Code | 1 |
| CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion | Code | 1 |
| Can Language Models Replace Programmers for Coding? REPOCOD Says 'Not Yet' | Code | 1 |
| DataSculpt: Crafting Data Landscapes for Long-Context LLMs through Multi-Objective Partitioning | Code | 1 |
| Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion | Code | 1 |
| Can Large Language Models Write Parallel Code? | Code | 1 |
| Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs | Code | 1 |
| GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models | Code | 1 |
| Language Models for Code Completion: A Practical Evaluation | Code | 1 |
| How Effective Are Neural Networks for Fixing Security Vulnerabilities | Code | 1 |
| CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection | Code | 1 |
| IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators | Code | 1 |
| Productivity Assessment of Neural Code Completion | Code | 1 |
| Empirical Study of Transformers for Source Code | Code | 1 |
| Curriculum Learning for Small Code Language Models | | 0 |
| Critique Ability of Large Language Models | | 0 |
| ContextModule: Improving Code Completion via Repository-level Contextual Information | | 0 |
Page 3 of 9

Benchmark Results

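| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | deepseek-coder-33b-base | Average | 69.01 | | Unverified |
| 2 | deepseek-coder-6.7b-base | Average | 63.4 | | Unverified |
| 3 | starcoderbase | Average | 55.54 | | Unverified |
| 4 | gpt-4-1106-preview | Average | 53.28 | | Unverified |
| 5 | CodeLlama-13b-hf | Average | 52.78 | | Unverified |
| 6 | deepseek-coder-1.3b-base | Average | 52.63 | | Unverified |
| 7 | CodeLlama-34b-hf | Average | 49.66 | | Unverified |
| 8 | CodeLlama-7b-hf | Average | 45 | | Unverified |
| 9 | gpt-3.5-turbo-0301 | Average | 40.86 | | Unverified |
| 10 | incoder-6B | Average | 33.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CodeGPT-adapted | Accuracy (token-level) | 77.13 | | Unverified |
| 2 | CodeT5+ 770M | EM (line-level) | 37.9 | | Unverified |
| 3 | CodeT5+ 220M | EM (line-level) | 35.17 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CodeGPT-adapted | Accuracy (token-level) | 75.11 | | Unverified |
| 2 | CodeT5+ 770M | EM (line-level) | 44.86 | | Unverified |
| 3 | CodeT5+ 220M | EM (line-level) | 43.42 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SantaCoder-MGD | Compilation Rate | 73.03 | | Unverified |
| 2 | SantaCoder | Compilation Rate | 59.97 | | Unverified |
| 3 | SantaCoder | Compilation Rate | 59.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Rambo | Compilation Rate | 76.47 | | Unverified |
| 2 | RepoCoder | Compilation Rate | 74.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Rambo | Compilation Rate | 61.7 | | Unverified |
| 2 | RepoCoder | Compilation Rate | 58.09 | | Unverified |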