SOTAVerified

zero-shot-classification

Papers

Showing 1-25 of 422 papers

Title | Status | Hype
DEARLi: Decoupled Enhancement of Recognition and Localization for Semi-supervised Panoptic Segmentation | Code | 0
Comparison of ConvNeXt and Vision-Language Models for Breast Density Assessment in Screening Mammography | - | 0
Harmonizing and Merging Source Models for CLIP-based Domain Generalization | - | 0
Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models | Code | 1
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models | Code | 2
Distill CLIP (DCLIP): Enhancing Image-Text Retrieval via Cross-Modal Transformer Distillation | - | 0
AmorLIP: Efficient Language-Image Pretraining via Amortization | Code | 0
Beginning with You: Perceptual-Initialization Improves Vision-Language Representation and Alignment | - | 0
From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection | Code | 1
Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption | Code | 0
StarFT: Robust Fine-tuning of Zero-shot Models via Spuriosity Alignment | Code | 0
Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner | Code | 2
Advanced Crash Causation Analysis for Freeway Safety: A Large Language Model Approach to Identifying Key Contributing Factors | - | 0
TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining | - | 0
Image Classification Using a Diffusion Model as a Pre-Training Model | - | 0
MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks | Code | 1
FG-CLIP: Fine-Grained Visual and Textual Alignment | Code | 4
Fill the Gap: Quantifying and Reducing the Modality Gap in Image-Text Representation Learning | - | 0
On the effectiveness of Large Language Models in the mechanical design domain | Code | 0
Helping Large Language Models Protect Themselves: An Enhanced Filtering and Summarization System | - | 0
Tell Me What You Know About Sexism: Expert-LLM Interaction Strategies and Co-Created Definitions for Zero-Shot Sexism Detection | Code | 0
RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Radiology with Zero-Shot Multi-Task Capability | - | 0
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction | Code | 1
Refining CLIP's Spatial Awareness: A Visual-Centric Perspective | - | 0
CIBR: Cross-modal Information Bottleneck Regularization for Robust CLIP Generalization | - | 0

No leaderboard results yet.