SOTAVerified

multimodal interaction

Papers

Showing 1–10 of 106 papers

Title | Status | Hype
Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model | Code | 5
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction | Code | 4
Segment and Track Anything | Code | 4
I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts | Code | 2
OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer | Code | 2
Foundations and Recent Trends in Multimodal Mobile Agents: A Survey | Code | 2
TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data | Code | 2
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference | Code | 2
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want | Code | 2
Agent AI: Surveying the Horizons of Multimodal Interaction | Code | 2
Page 1 of 11

No leaderboard results yet.