SOTAVerified

Code Completion

Papers

Showing 51-100 of 212 papers

Title | Status | Hype
Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass | Code | 1
RAMBO: Enhancing RAG-based Repository-Level Method Body Completion | Code | 1
CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context | Code | 1
ReACC: A Retrieval-Augmented Code Completion Framework | Code | 1
Multi-lingual Evaluation of Code Generation Models | Code | 1
Empirical Study of Transformers for Source Code | Code | 1
Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming | Code | 1
CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection | Code | 1
LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation | Code | 1
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion | Code | 1
A Syntax-Guided Edit Decoder for Neural Program Repair | Code | 1
LambdaNet: Probabilistic Type Inference using Graph Neural Networks | Code | 1
Ada-Instruct: Adapting Instruction Generators for Complex Reasoning | Code | 1
Learning Deep Semantics for Test Completion | Code | 1
DataSculpt: Crafting Data Landscapes for Long-Context LLMs through Multi-Objective Partitioning | Code | 1
Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion | Code | 1
Can Large Language Models Write Parallel Code? | Code | 1
LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations | Code | 1
MetaTPTrans: A Meta Learning Approach for Multilingual Code Representation Learning | Code | 1
Long Code Arena: a Set of Benchmarks for Long-Context Code Models | Code | 1
RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems | Code | 1
How to Get Your LLM to Generate Challenging Problems for Evaluation | Code | 1
Curriculum Learning for Small Code Language Models | - | 0
Critique Ability of Large Language Models | - | 0
ContextModule: Improving Code Completion via Repository-level Contextual Information | - | 0
Context Composing for Full Line Code Completion | - | 0
Compilable Neural Code Generation with Compiler Feedback | - | 0
A Self-Attentional Neural Architecture for Code Completion with Multi-Task Learning | - | 0
AKD : Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks | - | 0
SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair | - | 0
Comparative Analysis of Large Language Models for Context-Aware Code Completion using SAFIM Framework | - | 0
Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing | - | 0
Beyond Autocomplete: Designing CopilotLens Towards Transparent and Explainable AI Coding Agents | - | 0
Better Context Makes Better Code Language Models: A Case Study on Function Call Argument Completion | - | 0
A Review of Repository Level Prompting for LLMs | - | 0
CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts | - | 0
Laminar: A New Serverless Stream-based Framework with Semantic Code Search and Code Completion | - | 0
Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning | - | 0
Benchmarking Causal Study to Interpret Large Language Models for Source Code | - | 0
HiLDe: Intentional Code Generation via Human-in-the-Loop Decoding | - | 0
Insights from the Usage of the Ansible Lightspeed Code Completion Service | - | 0
Automated Code Generation and Validation for Software Components of Microcontrollers | - | 0
Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension | - | 0
CodeGemma: Open Code Models Based on Gemma | - | 0
TaskEval: Assessing Difficulty of Code Generation Tasks for Large Language Models | - | 0
Unveiling Code Pre-Trained Models: Investigating Syntax and Semantics Capacities | - | 0
Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions | - | 0
All You Need Is Logs: Improving Code Completion by Learning from Anonymous IDE Usage Logs | - | 0
GenAI for Simulation Model in Model-Based Systems Engineering | - | 0
Full Line Code Completion: Bringing AI to Desktop | - | 0
Page 2 of 5

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | deepseek-coder-33b-base | Average | 69.01 | - | Unverified
2 | deepseek-coder-6.7b-base | Average | 63.4 | - | Unverified
3 | starcoderbase | Average | 55.54 | - | Unverified
4 | gpt-4-1106-preview | Average | 53.28 | - | Unverified
5 | CodeLlama-13b-hf | Average | 52.78 | - | Unverified
6 | deepseek-coder-1.3b-base | Average | 52.63 | - | Unverified
7 | CodeLlama-34b-hf | Average | 49.66 | - | Unverified
8 | CodeLlama-7b-hf | Average | 45 | - | Unverified
9 | gpt-3.5-turbo-0301 | Average | 40.86 | - | Unverified
10 | incoder-6B | Average | 33.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CodeGPT-adapted | Accuracy (token-level) | 77.13 | - | Unverified
2 | CodeT5+ 770M | EM (line-level) | 37.9 | - | Unverified
3 | CodeT5+ 220M | EM (line-level) | 35.17 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CodeGPT-adapted | Accuracy (token-level) | 75.11 | - | Unverified
2 | CodeT5+ 770M | EM (line-level) | 44.86 | - | Unverified
3 | CodeT5+ 220M | EM (line-level) | 43.42 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SantaCoder-MGD | Compilation Rate | 73.03 | - | Unverified
2 | SantaCoder | Compilation Rate | 59.97 | - | Unverified
3 | SantaCoder | Compilation Rate | 59.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Rambo | Compilation Rate | 76.47 | - | Unverified
2 | RepoCoder | Compilation Rate | 74.02 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Rambo | Compilation Rate | 61.7 | - | Unverified
2 | RepoCoder | Compilation Rate | 58.09 | - | Unverified