SOTAVerified

zero-shot-classification

Papers

Showing 51-75 of 422 papers

| Title | Status | Hype |
| --- | --- | --- |
| Captured by Captions: On Memorization and its Mitigation in CLIP Models | | 0 |
| DCFormer: Efficient 3D Vision-Language Modeling with Decomposed Convolutions | | 0 |
| LR0.FM: Low-Res Benchmark and Improving Robustness for Zero-Shot Classification in Foundation Models | Code | 1 |
| Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding | Code | 2 |
| Revisiting CLIP: Efficient Alignment of 3D MRI and Tabular Data using Domain-Specific Foundation Models | Code | 0 |
| KPL: Training-Free Medical Knowledge Mining of Vision-Language Models | Code | 0 |
| FLAVARS: A Multimodal Foundational Language and Vision Alignment Model for Remote Sensing | | 0 |
| BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature | Code | 2 |
| A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI | Code | 0 |
| A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges | Code | 4 |
| LLMs & Legal Aid: Understanding Legal Needs Exhibited Through User Queries | | 0 |
| Generalized Zero-Shot Classification via Semantics-Free Inter-Class Feature Generation | | 0 |
| Cross-Modal 3D Representation with Multi-View Images and Point Clouds | | 0 |
| Multiple Consistency-guided Test-Time Adaptation for Contrastive Audio-Language Models with Unlabeled Audio | | 0 |
| DINOv2 Meets Text: A Unified Framework for Image- and Pixel-Level Vision-Language Alignment | Code | 0 |
| Adaptive Pruning for Large Language Models with Structural Importance Awareness | | 0 |
| Zero-Shot Image Moderation in Google Ads with LLM-Assisted Textual Descriptions and Cross-modal Co-embeddings | | 0 |
| CRoF: CLIP-based Robust Few-shot Learning on Noisy Labels | | 0 |
| A Simple and Efficient Baseline for Zero-Shot Generative Classification | | 0 |
| An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques | Code | 0 |
| Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision? | Code | 0 |
| SenCLIP: Enhancing zero-shot land-use mapping for Sentinel-2 with ground-level prompting | Code | 1 |
| Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning | | 0 |
| S^3: Synonymous Semantic Space for Improving Zero-Shot Generalization of Vision-Language Models | | 0 |
| Automated Medical Report Generation for ECG Data: Bridging Medical Text and Signal Processing with Deep Learning | Code | 0 |

No leaderboard results yet.