SOTAVerified

cross-modal alignment

Papers

Showing 51-60 of 342 papers

Title | Status | Hype
Probabilistic Embeddings for Frozen Vision-Language Models: Uncertainty Quantification with Gaussian Process Latent Variable Models | Code | 0
PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing | | 0
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment | Code | 1
MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation | Code | 0
A Multi-Agent Framework for Automated Qinqiang Opera Script Generation Using Large Language Models | | 0
Cross-attention for State-based model RWKV-7 | | 0
TMCIR: Token Merge Benefits Composed Image Retrieval | | 0
InfoMAE: Pair-Efficient Cross-Modal Alignment for Multimodal Time-Series Sensing Signals | | 0
3D CoCa: Contrastive Learners are 3D Captioners | Code | 0
VLMT: Vision-Language Multimodal Transformer for Multimodal Multi-hop Question Answering | | 0

No leaderboard results yet.