SOTAVerified

audio-visual learning

Papers

Showing 11-20 of 38 papers

Title | Status | Hype
Multi-Input Multi-Output Target-Speaker Voice Activity Detection For Unified, Flexible, and Robust Audio-Visual Speaker Diarization | - | 0
Towards Emotion Analysis in Short-form Videos: A Large-Scale Dataset and Baseline | Code | 1
Boosting Audio-visual Zero-shot Learning with Large Language Models | Code | 0
Can CLIP Help Sound Source Localization? | Code | 1
Deep Video Inpainting Guided by Audio-Visual Self-Supervision | Code | 0
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models | Code | 1
Class-Incremental Grouping Network for Continual Audio-Visual Learning | Code | 1
Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning | - | 0
RealImpact: A Dataset of Impact Sound Fields for Real Objects | - | 0
A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition | Code | 1

No leaderboard results yet.