SOTAVerified|Agents Browse Leaderboard About

cross-modal alignment

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 326–342 of 342 papers

Title	Date	Tasks	Status	Hype	Score
OmniBind: Teach to Build Unequal-Scale Modality Interaction for Omni-Bind of All	May 25, 2024	Allcross-modal alignment	—Unverified	0	0
OmniVL:One Foundation Model for Image-Language and Video-Language Tasks	Sep 15, 2022	Action ClassificationAction Recognition	—Unverified	0	0
OneEncoder: A Lightweight Framework for Progressive Alignment of Modalities	Sep 17, 2024	cross-modal alignmentQuestion Answering	—Unverified	0	0
On the Language Encoder of Contrastive Cross-modal Models	Oct 20, 2023	cross-modal alignmentSentence	—Unverified	0	0
OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection	Dec 12, 2023	cross-modal alignmentobject-detection	—Unverified	0	0
OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection	Mar 9, 2025	3D Object DetectionAutonomous Driving	—Unverified	0	0
PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing	May 6, 2025	cross-modal alignment	—Unverified	0	0
PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features	Dec 5, 2023	cross-modal alignmentDecoder	—Unverified	0	0
Prompt-based Context- and Domain-aware Pretraining for Vision and Language Navigation	Sep 7, 2023	Contrastive Learningcross-modal alignment	—Unverified	0	0
Prototype-guided Cross-modal Completion and Alignment for Incomplete Text-based Person Re-identification	Sep 29, 2023	cross-modal alignmentPerson Re-Identification	—Unverified	0	0
RAC3: Retrieval-Augmented Corner Case Comprehension for Autonomous Driving with Vision-Language Models	Dec 15, 2024	Autonomous DrivingContrastive Learning	—Unverified	0	0
Reinforcement Learning for Weakly Supervised Temporal Grounding of Natural Language in Untrimmed Videos	Sep 18, 2020	cross-modal alignmentreinforcement-learning	—Unverified	0	0
Representation Discrepancy Bridging Method for Remote Sensing Image-Text Retrieval	May 22, 2025	cross-modal alignmentImage-text Retrieval	—Unverified	0	0
Retrieving-to-Answer: Zero-Shot Video Question Answering with Frozen Large Language Models	Jun 15, 2023	cross-modal alignmentDomain Generalization	—Unverified	0	0
Revisiting Misalignment in Multispectral Pedestrian Detection: A Language-Driven Approach for Cross-modal Alignment Fusion	Nov 27, 2024	cross-modal alignmentPedestrian Detection	—Unverified	0	0
Scene-Intuitive Agent for Remote Embodied Visual Grounding	Mar 24, 2021	cross-modal alignmentNavigate	—Unverified	0	0
SE4Lip: Speech-Lip Encoder for Talking Head Synthesis to Solve Phoneme-Viseme Alignment Ambiguity	Apr 8, 2025	3DGScross-modal alignment	—Unverified	0	0

Show:10 25 50

← PrevPage 14 of 14Next →

No leaderboard results yet.