SOTAVerified

multimodal interaction

Papers

Showing 26-50 of 106 papers

Title | Status | Hype
Spatio-Temporal 3D Point Clouds from WiFi-CSI Data via Transformer Networks | Code | 1
Robi Butler: Multimodal Remote Interaction with a Household Robot Assistant | - | 0
Mamba-Enhanced Text-Audio-Video Alignment Network for Emotion Recognition in Conversations | Code | 1
LLM-Assisted Visual Analytics: Opportunities and Challenges | - | 0
RGBT Tracking via All-layer Multimodal Interactions with Progressive Fusion Mamba | - | 0
Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions | Code | 0
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic | - | 0
A Unified Understanding of Adversarial Vulnerability Regarding Unimodal Models and Vision-Language Pre-training Models | - | 0
UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models | Code | 1
TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data | Code | 2
Empathic Grounding: Explorations using Multimodal Interaction and Large Language Models with Conversational Agents | Code | 0
HGNET: A Hierarchical Feature Guided Network for Occupancy Flow Field Prediction | - | 0
Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models | Code | 1
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents | - | 0
A look under the hood of the Interactive Deep Learning Enterprise (No-IDLE) | - | 0
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference | Code | 2
EMMI -- Empathic Multimodal Motivational Interviews Dataset: Analyses and Annotations | - | 0
Revisiting Multimodal Emotion Recognition in Conversation from the Perspective of Graph Spectrum | - | 0
Narrative Action Evaluation with Prompt-Guided Multimodal Interaction | Code | 1
Cooperative Sentiment Agents for Multimodal Sentiment Analysis | Code | 1
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want | Code | 2
BlendScape: Enabling End-User Customization of Video-Conferencing Environments through Generative AI | - | 0
Improving Adversarial Transferability of Vision-Language Pre-training Models through Collaborative Multimodal Interaction | - | 0
On the Arrow of Inference | - | 0
Memory-Inspired Temporal Prompt Interaction for Text-Image Classification | - | 0

No leaderboard results yet.