SOTAVerified

cross-modal alignment

Papers

Showing 161-170 of 342 papers

| Title | Status | Hype |
| --- | --- | --- |
| COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking | | 0 |
| FineLIP: Extending CLIP's Reach via Fine-Grained Alignment with Longer Text Inputs | | 0 |
| SViQA: A Unified Speech-Vision Multimodal Model for Textless Visual Question Answering | | 0 |
| CADFormer: Fine-Grained Cross-modal Alignment and Decoding Transformer for Referring Remote Sensing Image Segmentation | | 0 |
| NeuroLIP: Interpretable and Fair Cross-Modal Alignment of fMRI and Phenotypic Text | | 0 |
| GatedxLSTM: A Multimodal Affective Computing Approach for Emotion Recognition in Conversations | | 0 |
| AutoRad-Lung: A Radiomic-Guided Prompting Autoregressive Vision-Language Model for Lung Nodule Malignancy Prediction | | 0 |
| LangBridge: Interpreting Image as a Combination of Language Embeddings | | 0 |
| Shushing! Let's Imagine an Authentic Speech from the Silent Video | | 0 |
| Language-based Image Colorization: A Benchmark and Beyond | Code | 0 |

No leaderboard results yet.