SOTAVerified

Code Search

The goal of Code Search is to retrieve code fragments from a large code corpus that most closely match a developer’s intent, which is expressed in natural language.

Source: When Deep Learning Met Code Search
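
To make the task definition concrete, here is a minimal sketch of code search as ranking: a natural-language query is scored against every snippet in a corpus and the best matches are returned. It uses plain bag-of-words cosine similarity, which is an illustrative baseline chosen for brevity and not the method of any paper listed on this page. The names `tokenize`, `cosine`, `search`, and `CORPUS` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased alphabetic runs; this splits snake_case identifiers
    # such as read_file into "read" and "file".
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


def cosine(a, b):
    # Cosine similarity between two token-count vectors (Counters).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, corpus, top_k=3):
    # Score every snippet against the query and return the top_k
    # (score, snippet) pairs, best first.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(code))), code) for code in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy corpus, invented for illustration only.
CORPUS = [
    "def read_file(path):\n    with open(path) as f:\n        return f.read()",
    "def sort_list(items):\n    return sorted(items)",
    "def http_get(url):\n    import urllib.request\n    return urllib.request.urlopen(url).read()",
]

if __name__ == "__main__":
    for score, code in search("read a file from a path", CORPUS):
        print(round(score, 3), code.splitlines()[0])
```

Learned approaches typically replace the token-count vectors with embeddings from a trained encoder, but the ranking step keeps the same shape.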

Papers

Showing 1-50 of 125 papers

Title | Status | Hype
AutoCodeRover: Autonomous Program Improvement | Code | 7
CodeSAM: Source Code Representation Learning by Infusing Self-Attention with Multi-Code-View Graphs | Code | 2
CoIR: A Comprehensive Benchmark for Code Information Retrieval Models | Code | 2
RepoQA: Evaluating Long Context Code Understanding | Code | 2
Zero-Shot Cross-Domain Code Search without Fine-Tuning | Code | 1
ViC: Virtual Compiler Is All You Need For Assembly Code Search | Code | 1
Advanced Detection of Source Code Clones via an Ensemble of Unsupervised Similarity Measures | Code | 1
Source Code Clone Detection Using Unsupervised Similarity Measures | Code | 1
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search | Code | 1
Rethinking Negative Pairs in Code Search | Code | 1
Language Models are Universal Embedders | Code | 1
Structure-Aware Language Model Pretraining Improves Dense Retrieval on Structured Data | Code | 1
Backdooring Neural Code Search | Code | 1
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation | Code | 1
One Adapter for All Programming Languages? Adapter Tuning for Code Search and Summarization | Code | 1
Exploring Representation-Level Augmentation for Code Search | Code | 1
ContraCLM: Contrastive Learning For Causal Language Model | Code | 1
XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence | Code | 1
UniXcoder: Unified Cross-Modal Pre-training for Code Representation | Code | 1
Code Search based on Context-aware Code Translation | Code | 1
On the Importance of Building High-quality Training Datasets for Neural Code Search | Code | 1
Learning Deep Semantic Model for Code Search using CodeSearchNet Corpus | Code | 1
Text and Code Embeddings by Contrastive Pre-Training | Code | 1
Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding | Code | 1
GraphSearchNet: Enhancing GNNs via Capturing Global Dependencies for Semantic Code Search | Code | 1
Is a Single Model Enough? MuCoS: A Multi-Model Ensemble Learning for Semantic Code Search | Code | 1
Multimodal Representation for Neural Code Search | Code | 1
CoSQA: 20,000+ Web Queries for Code Search and Question Answering | Code | 1
deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search | Code | 1
DOBF: A Deobfuscation Pre-Training Objective for Programming Languages | Code | 1
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation | Code | 1
PalmTree: Learning an Assembly Language Model for Instruction Embedding | Code | 1
Search4Code: Code Search Intent Classification Using Weak Supervision | Code | 1
Neural Code Search Revisited: Enhancing Code Snippet Retrieval through Natural Language Intent | Code | 1
Faster Person Re-Identification | Code | 1
funcGNN: A Graph Neural Network Approach to Program Similarity | Code | 1
A Toolkit for Generating Code Knowledge Graphs | Code | 1
CodeSearchNet Challenge: Evaluating the State of Semantic Code Search | Code | 1
MGS3: A Multi-Granularity Self-Supervised Code Search Framework | - | 0
DeepRTL2: A Versatile Model for RTL-Related Tasks | - | 0
LEANCODE: Understanding Models Better for Code Simplification of Pre-trained Large Language Models | - | 0
Knowledge Graph Based Repository-Level Code Generation | - | 0
Large Language Models are Qualified Benchmark Builders: Rebuilding Pre-Training Datasets for Advancing Code Intelligence Tasks | - | 0
Towards Leveraging Large Language Model Summaries for Topic Modeling in Source Code | - | 0
A Study on Mixup-Inspired Augmentation Methods for Software Vulnerability Detection | - | 0
OASIS: Order-Augmented Strategy for Improved Code Search | - | 0
LoRACode: LoRA Adapters for Code Embeddings | - | 0
MoSE: Hierarchical Self-Distillation Enhances Early Layer Embeddings | - | 0
Beyond Natural Language Perplexity: Detecting Dead Code Poisoning in Code Generation Datasets | - | 0
URECA: The Chain of Two Minimum Set Cover Problems exists behind Adaptation to Shifts in Semantic Code Search | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | cpt-code M | Overall | 93.5 | - | Unverified
2 | cpt-code S | Overall | 93.4 | - | Unverified
3 | CodeT5+ 770M | Overall | 77.4 | - | Unverified
4 | GraphCodeBERT | Overall | 77.4 | - | Unverified
5 | CodeT5+ 220M | Overall | 77.1 | - | Unverified
6 | CodeBERT | Overall | 76 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Self-attention | Test MRR | 0.84 | - | Unverified
2 | NBOW | Test MRR | 0.81 | - | Unverified
3 | RNN | Test MRR | 0.77 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CodeT5+ 770M | MRR | 44.7 | - | Unverified
2 | CodeT5+ 220M | MRR | 43.3 | - | Unverified
3 | CodeBERT | MRR | 27.19 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Uni-SBT | MRR | 0.36 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CodeBERT | Accuracy | 47.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Voyage-code-002 | nDCG@10 | 56.26 | - | Unverified
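
Several of the tables above report MRR or nDCG@10. As a reading aid, the following sketch computes both metrics with their standard textbook definitions. The function names and the toy inputs are hypothetical, and individual benchmarks may differ in details such as candidate pool size or whether scores are given on a 0 to 1 scale or as percentages.

```python
import math


def mean_reciprocal_rank(ranked_lists, relevant_sets):
    # For each query, take 1 / rank of the first relevant result
    # (0 if none is retrieved), then average over all queries.
    total = 0.0
    for results, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(results, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


def ndcg_at_k(results, gains, k=10):
    # gains maps each relevant item to a graded relevance value;
    # items missing from gains count as relevance 0.
    dcg = sum(
        gains.get(item, 0) / math.log2(i + 1)
        for i, item in enumerate(results[:k], start=1)
    )
    # Ideal DCG: the same gains placed in the best possible order.
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0


if __name__ == "__main__":
    # Two toy queries: the first hit is at rank 2, the second at rank 1.
    print(mean_reciprocal_rank([["a", "b", "c"], ["x", "y", "z"]],
                               [{"b"}, {"x"}]))
    # One relevant item retrieved at rank 2.
    print(ndcg_at_k(["b", "a"], {"a": 1}))
```

Both functions return values between 0 and 1; multiplying by 100 gives the percentage form that some leaderboards use.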