SOTAVerified

zero-shot-classification

Papers

Showing 150 of 422 papers

| Title | Status | Hype |
| --- | --- | --- |
| DEARLi: Decoupled Enhancement of Recognition and Localization for Semi-supervised Panoptic Segmentation | Code | 0 |
| Comparison of ConvNeXt and Vision-Language Models for Breast Density Assessment in Screening Mammography | | 0 |
| Harmonizing and Merging Source Models for CLIP-based Domain Generalization | | 0 |
| Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models | Code | 1 |
| GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models | Code | 2 |
| Distill CLIP (DCLIP): Enhancing Image-Text Retrieval via Cross-Modal Transformer Distillation | | 0 |
| AmorLIP: Efficient Language-Image Pretraining via Amortization | Code | 0 |
| Beginning with You: Perceptual-Initialization Improves Vision-Language Representation and Alignment | | 0 |
| From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection | Code | 1 |
| StarFT: Robust Fine-tuning of Zero-shot Models via Spuriosity Alignment | Code | 0 |
| Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption | Code | 0 |
| Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner | Code | 2 |
| Advanced Crash Causation Analysis for Freeway Safety: A Large Language Model Approach to Identifying Key Contributing Factors | | 0 |
| TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining | | 0 |
| Image Classification Using a Diffusion Model as a Pre-Training Model | | 0 |
| MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks | Code | 1 |
| FG-CLIP: Fine-Grained Visual and Textual Alignment | Code | 4 |
| Fill the Gap: Quantifying and Reducing the Modality Gap in Image-Text Representation Learning | | 0 |
| On the effectiveness of Large Language Models in the mechanical design domain | Code | 0 |
| Helping Large Language Models Protect Themselves: An Enhanced Filtering and Summarization System | | 0 |
| Tell Me What You Know About Sexism: Expert-LLM Interaction Strategies and Co-Created Definitions for Zero-Shot Sexism Detection | Code | 0 |
| RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Radiology with Zero-Shot Multi-Task Capability | | 0 |
| Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction | Code | 1 |
| Refining CLIP's Spatial Awareness: A Visual-Centric Perspective | | 0 |
| CIBR: Cross-modal Information Bottleneck Regularization for Robust CLIP Generalization | | 0 |
| ViLAaD: Enhancing "Attracting and Dispersing" Source-Free Domain Adaptation with Vision-and-Language Model | | 0 |
| Enhancing Small Language Models for Cross-Lingual Generalized Zero-Shot Classification with Soft Prompt Tuning | | 0 |
| Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection | | 0 |
| Bayesian Modeling of Zero-Shot Classifications for Urban Flood Detection | Code | 0 |
| Advancing Medical Representation Learning Through High-Quality Data | Code | 1 |
| Real-Time Cell Sorting with Scalable In Situ FPGA-Accelerated Deep Learning | Code | 0 |
| TLAC: Two-stage LMM Augmented CLIP for Zero-Shot Classification | Code | 0 |
| Leveraging Vision-Language Embeddings for Zero-Shot Learning in Histopathology Images | | 0 |
| Controlling Latent Diffusion Using Latent CLIP | Code | 1 |
| DiffCLIP: Differential Attention Meets CLIP | Code | 2 |
| OFF-CLIP: Improving Normal Detection Confidence in Radiology CLIP with Simple Off-Diagonal Term Auto-Adjustment | Code | 0 |
| A Zero-Shot Learning Approach for Ephemeral Gully Detection from Remote Sensing using Vision Language Models | | 0 |
| Analyzing CLIP's Performance Limitations in Multi-Object Scenarios: A Controlled High-Resolution Study | | 0 |
| SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning | | 0 |
| CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation | Code | 1 |
| Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs | | 0 |
| Progressive Local Alignment for Medical Multimodal Pre-training | | 0 |
| CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification | Code | 1 |
| UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting | | 0 |
| Using tournaments to calculate AUROC for zero-shot classification with LLMs | | 0 |
| SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features | Code | 0 |
| Enhancing Chest X-ray Classification through Knowledge Injection in Cross-Modality Learning | | 0 |
| Text Classification in the LLM Era - Where do we stand? | | 0 |
| Optimizing GPT for Video Understanding: Zero-Shot Performance and Prompt Engineering | | 0 |
| From Haystack to Needle: Label Space Reduction for Zero-shot Classification | | 0 |

No leaderboard results yet.