SOTAVerified

zero-shot-classification

Papers

Showing 251–300 of 422 papers

Title | Status | Hype
Cross-Modal Retrieval Meets Inference: Improving Zero-Shot Classification with Cross-Modal Retrieval | - | 0
Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment | Code | 1
Adversarial Illusions in Multi-Modal Embeddings | Code | 1
Image-free Classifier Injection for Zero-Shot Classification | Code | 1
DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability | - | 0
Robustifying Point Cloud Networks by Refocusing | Code | 0
ReCLIP: Refine Contrastive Language Image Pre-Training with Source Free Domain Adaptation | Code | 1
PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts | Code | 1
Developing and Evaluating Tiny to Medium-Sized Turkish BERT Models | - | 0
PRIOR: Prototype Representation Joint Learning from Medical Images and Reports | Code | 1
MineralImage5k: A benchmark for zero-shot raw mineral visual recognition and description | Code | 1
Distilling Knowledge from Text-to-Image Generative Models Improves Visio-Linguistic Reasoning in CLIP | - | 0
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing | Code | 2
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | Code | 2
Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation | - | 0
Improving Zero-Shot Detection of Low Prevalence Chest Pathologies using Domain Pre-trained Language Models | Code | 0
Analysis of the Fed's communication by using textual entailment model of Zero-Shot classification | - | 0
UCAS-IIE-NLP at SemEval-2023 Task 12: Enhancing Generalization of Multilingual BERT for Low-resource Sentiment Analysis | Code | 1
Multi-level Cross-modal Feature Alignment via Contrastive Learning towards Zero-shot Classification of Remote Sensing Image Scenes | Code | 0
Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning | Code | 2
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models | Code | 1
Improved Probabilistic Image-Text Representations | Code | 1
Adapting Language-Audio Models as Few-Shot Audio Learners | - | 0
DiffCLIP: Leveraging Stable Diffusion for Language Grounded 3D Classification | - | 0
OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning | Code | 0
S-CLIP: Semi-supervised Vision-Language Learning using Few Specialist Captions | Code | 1
Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science | - | 0
Parts of Speech-Grounded Subspaces in Vision-Language Models | Code | 1
LLM-Pruner: On the Structural Pruning of Large Language Models | Code | 3
MedBLIP: Bootstrapping Language-Image Pre-training from 3D Medical Images and Texts | Code | 1
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding | Code | 2
Boosting Visual-Language Models by Exploiting Hard Samples | Code | 0
The Benefits of Label-Description Training for Zero-Shot Text Classification | Code | 0
Unsupervised Improvement of Audio-Text Cross-Modal Representations | Code | 0
The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks | Code | 1
CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval | Code | 0
WYTIWYR: A User Intent-Aware Framework with Multi-modal Inputs for Visualization Retrieval | Code | 0
SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval) | Code | 1
What does CLIP know about a red circle? Visual prompt engineering for VLMs | - | 0
RECLIP: Resource-efficient CLIP by Training with Small Images | - | 0
Exploring Vision-Language Models for Imbalanced Learning | Code | 1
SoftCLIP: Softer Cross-modal Alignment Makes CLIP Stronger | - | 0
Your Diffusion Model is Secretly a Zero-Shot Classifier | Code | 2
Evaluation of ChatGPT for NLP-based Mental Health Applications | - | 0
Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection | - | 0
Frozen Language Model Helps ECG Zero-Shot Learning | - | 0
Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification | - | 0
Robust Contrastive Language-Image Pre-training against Data Poisoning and Backdoor Attacks | Code | 1
Exploiting the Textual Potential from Vision-Language Pre-training for Text-based Person Search | - | 0
Describe me an Aucklet: Generating Grounded Perceptual Category Descriptions | Code | 0
Page 6 of 9

No leaderboard results yet.