Code Completion

Papers

Showing 51–60 of 212 papers

Title	Date	Tasks	Status	Hype	Score
LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations	Mar 16, 2023	Code Completion, Code Generation	Code Available	1	5
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation	Feb 9, 2021	BIG-bench Machine Learning, Clone Detection	Code Available	1	5
Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion	May 30, 2024	Code Completion, Retrieval	Code Available	1	5
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators	Mar 6, 2024	Code Completion, Code Generation	Code Available	1	5
Language Models for Code Completion: A Practical Evaluation	Feb 25, 2024	Code Completion, valid	Code Available	1	5
DataSculpt: Crafting Data Landscapes for Long-Context LLMs through Multi-Objective Partitioning	Sep 2, 2024	Code Completion, Combinatorial Optimization	Code Available	1	5
Empirical Study of Transformers for Source Code	Oct 15, 2020	Bug fixing, Code Completion	Code Available	1	5
LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation	Mar 25, 2025	Code Completion, Language Modeling	Code Available	1	5
Building A Coding Assistant via the Retrieval-Augmented Language Model	Oct 21, 2024	Code Completion, Code Generation	Code Available	1	5
CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection	Mar 12, 2025	Benchmarking, Code Classification	Code Available	1	5

Datasets: SAFIM, CodeXGLUE - Github Java Corpus, CodeXGLUE - PY150, DotPrompts, Defects4J, Rambo Benchmark

Benchmark Results

SAFIM
#	Model	Metric	Claimed	Verified	Status
1	deepseek-coder-33b-base	Average	69.01	—	Unverified
2	deepseek-coder-6.7b-base	Average	63.4	—	Unverified
3	starcoderbase	Average	55.54	—	Unverified
4	gpt-4-1106-preview	Average	53.28	—	Unverified
5	CodeLlama-13b-hf	Average	52.78	—	Unverified
6	deepseek-coder-1.3b-base	Average	52.63	—	Unverified
7	CodeLlama-34b-hf	Average	49.66	—	Unverified
8	CodeLlama-7b-hf	Average	45	—	Unverified
9	gpt-3.5-turbo-0301	Average	40.86	—	Unverified
10	incoder-6B	Average	33.79	—	Unverified

CodeXGLUE - Github Java Corpus
#	Model	Metric	Claimed	Verified	Status
1	CodeGPT-adapted	Accuracy (token-level)	77.13	—	Unverified
2	CodeT5+ 770M	EM (line-level)	37.9	—	Unverified
3	CodeT5+ 220M	EM (line-level)	35.17	—	Unverified

CodeXGLUE - PY150
#	Model	Metric	Claimed	Verified	Status
1	CodeGPT-adapted	Accuracy (token-level)	75.11	—	Unverified
2	CodeT5+ 770M	EM (line-level)	44.86	—	Unverified
3	CodeT5+ 220M	EM (line-level)	43.42	—	Unverified

DotPrompts
#	Model	Metric	Claimed	Verified	Status
1	SantaCoder-MGD	Compilation Rate	73.03	—	Unverified
2	SantaCoder	Compilation Rate	59.97	—	Unverified
3	SantaCoder	Compilation Rate	59.79	—	Unverified

Defects4J
#	Model	Metric	Claimed	Verified	Status
1	Rambo	Compilation Rate	76.47	—	Unverified
2	RepoCoder	Compilation Rate	74.02	—	Unverified

Rambo Benchmark
#	Model	Metric	Claimed	Verified	Status
1	Rambo	Compilation Rate	61.7	—	Unverified
2	RepoCoder	Compilation Rate	58.09	—	Unverified