| S^3: Synonymous Semantic Space for Improving Zero-Shot Generalization of Vision-Language Models | Dec 6, 2024 | zero-shot-classification, Zero-shot Generalization | Unverified | 0 |
| Automated Medical Report Generation for ECG Data: Bridging Medical Text and Signal Processing with Deep Learning | Dec 5, 2024 | Comment Generation, Decoder | Code Available | 0 |
| Multimodal Remote Sensing Scene Classification Using VLMs and Dual-Cross Attention Networks | Dec 3, 2024 | Classification, Scene Classification | Code Available | 0 |
| Perturb and Recover: Fine-tuning for Effective Backdoor Removal from CLIP | Dec 1, 2024 | Natural Language Understanding, zero-shot-classification | Code Available | 0 |
| Active Data Curation Effectively Distills Large-Scale Multimodal Models | Nov 27, 2024 | Decoder, Image Captioning | Unverified | 0 |
| Measuring similarity between embedding spaces using induced neighborhood graphs | Nov 13, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics | Nov 11, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| Asterisk*: Keep it Simple | Nov 8, 2024 | Classification, Knowledge Distillation | Unverified | 0 |
| Enhancing Visual Classification using Comparative Descriptors | Nov 8, 2024 | Classification, zero-shot-classification | Code Available | 0 |
| ResiDual Transformer Alignment with Spectral Decomposition | Oct 31, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| Active Learning for Vision-Language Models | Oct 29, 2024 | Active Learning, image-classification | Unverified | 0 |
| Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection | Oct 28, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| Label Set Optimization via Activation Distribution Kurtosis for Zero-shot Classification with Generative Models | Oct 24, 2024 | Classification, In-Context Learning | Unverified | 0 |
| MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report | Oct 21, 2024 | Diagnostic, Medical Diagnosis | Code Available | 0 |
| Assessing Open-world Forgetting in Generative Image Model Customization | Oct 18, 2024 | Image Generation, zero-shot-classification | Unverified | 0 |
| Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data? | Oct 17, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| LLM Chain Ensembles for Scalable and Accurate Data Annotation | Oct 16, 2024 | zero-shot-classification, Zero-Shot Learning | Code Available | 0 |
| CtrlSynth: Controllable Image Text Synthesis for Data-Efficient Multimodal Learning | Oct 15, 2024 | Image-text Retrieval, Text Retrieval | Unverified | 0 |
| A Unified Debiasing Approach for Vision-Language Models across Modalities and Tasks | Oct 10, 2024 | Fairness, Image Captioning | Code Available | 0 |
| GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models | Oct 8, 2024 | zero-shot-classification, Zero-Shot Learning | Code Available | 0 |
| Improving Predictor Reliability with Selective Recalibration | Oct 7, 2024 | zero-shot-classification, Zero-Shot Learning | Unverified | 0 |
| An Evaluation of Large Pre-Trained Models for Gesture Recognition using Synthetic Videos | Oct 3, 2024 | Classification, Gesture Recognition | Unverified | 0 |
| Toward a Holistic Evaluation of Robustness in CLIP Models | Oct 2, 2024 | Classification, Out-of-Distribution Detection | Unverified | 0 |
| NECOMIMI: Neural-Cognitive Multimodal EEG-informed Image Generation with Diffusion Models | Oct 1, 2024 | Contrastive Learning, EEG | Code Available | 0 |
| Zero-Shot Classification of Crisis Tweets Using Instruction-Finetuned Large Language Models | Sep 30, 2024 | Classification, Disaster Response | Unverified | 0 |