| A multi-stage augmented multimodal interaction network for fish feeding intensity quantification | Jun 17, 2025 | Decision Making, multimodal interaction | Unverified | 0 |
| InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback | May 29, 2025 | multimodal interaction | Unverified | 0 |
| ChartSketcher: Reasoning with Multimodal Feedback and Reflection for Chart Understanding | May 25, 2025 | Chart Understanding, Logical Reasoning | Code Available | 0 |
| DeepSORT-Driven Visual Tracking Approach for Gesture Recognition in Interactive Systems | May 11, 2025 | Gesture Recognition, multimodal interaction | Unverified | 0 |
| A Survey of Interactive Generative Video | Apr 30, 2025 | Autonomous Driving, multimodal interaction | Unverified | 0 |
| Immersive Multimedia Communication: State-of-the-Art on eXtended Reality Streaming | Mar 27, 2025 | multimodal interaction | Unverified | 0 |
| ReVision: A Dataset and Baseline VLM for Privacy-Preserving Task-Oriented Visual Instruction Rewriting | Feb 20, 2025 | Image Captioning, multimodal interaction | Unverified | 0 |
| Interactive Sketchpad: A Multimodal Tutoring System for Collaborative, Visual Problem-Solving | Feb 12, 2025 | Math, multimodal interaction | Unverified | 0 |
| Towards Explainable Multimodal Depression Recognition for Clinical Interviews | Jan 27, 2025 | Decision Making, Depression Detection | Code Available | 0 |
| FGU3R: Fine-Grained Fusion via Unified 3D Representation for Multimodal 3D Object Detection | Jan 8, 2025 | 3D Object Detection, Autonomous Driving | Unverified | 0 |