SOTAVerified

cross-modal alignment

Papers

Showing 126–150 of 342 papers

Title | Status | Hype
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data | | 0
Does Vision Accelerate Hierarchical Generalization in Neural Language Learners? | | 0
CIRP: Cross-Item Relational Pre-training for Multimodal Product Bundling | | 0
Disentangled Noisy Correspondence Learning | | 0
Chat-based Person Retrieval via Dialogue-Refined Cross-Modal Alignment | | 0
Language Model Mapping in Multimodal Music Learning: A Grand Challenge Proposal | | 0
ChartAdapter: Large Vision-Language Model for Chart Summarization | | 0
DiSa: Directional Saliency-Aware Prompt Learning for Generalizable Vision-Language Models | | 0
CGP-Tuning: Structure-Aware Soft Prompt Tuning for Code Vulnerability Detection | | 0
DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment | | 0
KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation | | 0
DF-Calib: Targetless LiDAR-Camera Calibration via Depth Flow | | 0
A Multi-Agent Framework for Automated Qinqiang Opera Script Generation Using Large Language Models | | 0
Detection-based Intermediate Supervision for Visual Question Answering | | 0
CATVis: Context-Aware Thought Visualization | | 0
Intriguing Properties of Large Language and Vision Models | | 0
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding | | 0
Denoising and Alignment: Rethinking Domain Generalization for Multimodal Face Anti-Spoofing | | 0
ALN-P3: Unified Language Alignment for Perception, Prediction, and Planning in Autonomous Driving | | 0
Deformable Attentive Visual Enhancement for Referring Segmentation Using Vision-Language Model | | 0
Towards Brain Passage Retrieval -- An Investigation of EEG Query Representations | | 0
Integrate Temporal Graph Learning into LLM-based Temporal Knowledge Graph Model | | 0
JPG - Jointly Learn to Align: Automated Disease Prediction and Radiology Report Generation | | 0
LangBridge: Interpreting Image as a Combination of Language Embeddings | | 0
DAP: Domain-aware Prompt Learning for Vision-and-Language Navigation | | 0

No leaderboard results yet.