SOTAVerified

Code Search

The goal of Code Search is to retrieve code fragments from a large code corpus that most closely match a developer’s intent, which is expressed in natural language.

Source: When Deep Learning Met Code Search
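
The task described above, ranking code fragments against a natural-language query, can be illustrated with a minimal sketch. It assumes a purely lexical bag-of-words model with cosine similarity, not the learned neural embeddings used by the papers listed on this page. The corpus, the function names (`tokenize`, `cosine`, `search`) and the query are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics, so snake_case names break into words."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, corpus, top_k=3):
    """Return the names of the top_k snippets most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(query_vec, Counter(tokenize(code))), name)
        for name, code in corpus.items()
    ]
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


# Hypothetical toy corpus: snippet name -> source text.
CORPUS = {
    "read_file": "def read_file(path):\n    with open(path) as f:\n        return f.read()",
    "sort_list": "def sort_list(items):\n    return sorted(items)",
    "http_get": "def http_get(url):\n    import urllib.request\n    return urllib.request.urlopen(url).read()",
}

results = search("read a file from a path", CORPUS)
```

Lexical matching of this kind fails when the query and the code share no vocabulary, which is the gap that learned query and code representations aim to close.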

Papers

Showing 1-50 of 125 papers

Title | Status | Hype
AutoCodeRover: Autonomous Program Improvement | Code | 7
RepoQA: Evaluating Long Context Code Understanding | Code | 2
CodeSAM: Source Code Representation Learning by Infusing Self-Attention with Multi-Code-View Graphs | Code | 2
CoIR: A Comprehensive Benchmark for Code Information Retrieval Models | Code | 2
CoSQA: 20,000+ Web Queries for Code Search and Question Answering | Code | 1
Is a Single Model Enough? MuCoS: A Multi-Model Ensemble Learning for Semantic Code Search | Code | 1
A Toolkit for Generating Code Knowledge Graphs | Code | 1
XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence | Code | 1
One Adapter for All Programming Languages? Adapter Tuning for Code Search and Summarization | Code | 1
Zero-Shot Cross-Domain Code Search without Fine-Tuning | Code | 1
Multimodal Representation for Neural Code Search | Code | 1
Backdooring Neural Code Search | Code | 1
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search | Code | 1
GraphSearchNet: Enhancing GNNs via Capturing Global Dependencies for Semantic Code Search | Code | 1
Faster Person Re-Identification | Code | 1
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation | Code | 1
funcGNN: A Graph Neural Network Approach to Program Similarity | Code | 1
Text and Code Embeddings by Contrastive Pre-Training | Code | 1
Neural Code Search Revisited: Enhancing Code Snippet Retrieval through Natural Language Intent | Code | 1
ViC: Virtual Compiler Is All You Need For Assembly Code Search | Code | 1
Language Models are Universal Embedders | Code | 1
deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search | Code | 1
On the Importance of Building High-quality Training Datasets for Neural Code Search | Code | 1
Source Code Clone Detection Using Unsupervised Similarity Measures | Code | 1
Search4Code: Code Search Intent Classification Using Weak Supervision | Code | 1
PalmTree: Learning an Assembly Language Model for Instruction Embedding | Code | 1
Advanced Detection of Source Code Clones via an Ensemble of Unsupervised Similarity Measures | Code | 1
Structure-Aware Language Model Pretraining Improves Dense Retrieval on Structured Data | Code | 1
Code Search based on Context-aware Code Translation | Code | 1
Rethinking Negative Pairs in Code Search | Code | 1
UniXcoder: Unified Cross-Modal Pre-training for Code Representation | Code | 1
CodeSearchNet Challenge: Evaluating the State of Semantic Code Search | Code | 1
DOBF: A Deobfuscation Pre-Training Objective for Programming Languages | Code | 1
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation | Code | 1
Learning Deep Semantic Model for Code Search using CodeSearchNet Corpus | Code | 1
Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding | Code | 1
Exploring Representation-Level Augmentation for Code Search | Code | 1
ContraCLM: Contrastive Learning For Causal Language Model | Code | 1
Repository-level Code Search with Neural Retrieval Methods | Code | 0
CoDesc: A Large Code-Description Parallel Dataset | Code | 0
CodeRetriever: Unimodal and Bimodal Contrastive Learning for Code Search | Code | 0
Code Execution with Pre-trained Language Models | Code | 0
REINFOREST: Reinforcing Semantic Code Similarity for Cross-Lingual Code Search Models | Code | 0
ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search | Code | 0
MELT: Mining Effective Lightweight Transformations from Pull Requests | Code | 0
Memorization and Generalization in Neural Code Intelligence Models | Code | 0
Isotropy Matters: Soft-ZCA Whitening of Embeddings for Semantic Code Search | Code | 0
CoSQA+: Pioneering the Multi-Choice Code Search Benchmark with Test-Driven Agents | Code | 0
GraphCodeBERT: Pre-training Code Representations with Data Flow | Code | 0
Generating Clarifying Questions for Query Refinement in Source Code Search | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | cpt-code M | Overall | 93.5 | | Unverified
2 | cpt-code S | Overall | 93.4 | | Unverified
3 | CodeT5+ 770M | Overall | 77.4 | | Unverified
4 | GraphCodeBERT | Overall | 77.4 | | Unverified
5 | CodeT5+ 220M | Overall | 77.1 | | Unverified
6 | CodeBERT | Overall | 76 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Self-attention | Test MRR | 0.84 | | Unverified
2 | NBOW | Test MRR | 0.81 | | Unverified
3 | RNN | Test MRR | 0.77 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CodeT5+ 770M | MRR | 44.7 | | Unverified
2 | CodeT5+ 220M | MRR | 43.3 | | Unverified
3 | CodeBERT | MRR | 27.19 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Uni-SBT | MRR | 0.36 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CodeBERT | Accuracy | 47.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Voyage-code-002 | nDCG@10 | 56.26 | | Unverified
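
Two of the metrics in the tables above, MRR and nDCG@10, have standard textbook definitions. The sketch below follows those definitions; it is not taken from the evaluation code of any listed benchmark, and the function names and toy inputs are hypothetical.

```python
import math


def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant result per query (0 if none is found)."""
    total = 0.0
    for results, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(results, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


def ndcg_at_k(results, gains, k=10):
    """Normalized discounted cumulative gain over the top k results.

    `gains` maps each relevant item to a graded relevance; unlisted items count as 0.
    """
    dcg = sum(
        gains.get(item, 0) / math.log2(position + 1)
        for position, item in enumerate(results[:k], start=1)
    )
    ideal_gains = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(
        gain / math.log2(position + 1)
        for position, gain in enumerate(ideal_gains, start=1)
    )
    return dcg / idcg if idcg else 0.0


# Toy example: two queries, with the relevant snippet at rank 2 and rank 1.
mrr = mean_reciprocal_rank([["a", "b", "c"], ["x", "y", "z"]], [{"b"}, {"x"}])
ndcg = ndcg_at_k(["b", "a"], {"a": 1})
```

Both functions return values in the range 0 to 1. The tables appear to mix scales, with some MRR values given as fractions (0.84) and others as percentages (44.7), so scores from different tables should not be compared without checking the scale.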