SOTAVerified

cross-modal alignment

Papers

Showing 191-200 of 342 papers

Title | Status | Hype
DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models | Code | 2
Transcending Fusion: A Multi-Scale Alignment Method for Remote Sensing Image-Text Retrieval | Code | 1
Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment | Code | 2
OmniBind: Teach to Build Unequal-Scale Modality Interaction for Omni-Bind of All | - | 0
Structural Entities Extraction and Patient Indications Incorporation for Chest X-ray Report Generation | Code | 1
AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability | - | 0
Context-Enhanced Video Moment Retrieval with Large Language Models | - | 0
Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation | Code | 1
Listen Then See: Video Alignment with Speaker Attention | Code | 0
HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding | Code | 2
Page 20 of 35

No leaderboard results yet.