SOTAVerified

cross-modal alignment

Papers

Showing 1-10 of 342 papers

Title | Status | Hype
Skywork-R1V3 Technical Report | Code | 7
Phantom: Subject-consistent video generation via cross-modal alignment | Code | 5
CrossOver: 3D Scene Cross-Modal Alignment | Code | 3
GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images | Code | 3
Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | Code | 3
Ola: Pushing the Frontiers of Omni-Modal Language Model | Code | 3
Flash-VStream: Efficient Real-Time Understanding for Long Video Streams | Code | 3
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation | Code | 3
Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams | Code | 3
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning | Code | 3

No leaderboard results yet.