| Lightweight Joint Audio-Visual Deepfake Detection via Single-Stream Multi-Modal Learning Framework | Jun 9, 2025 | audio-visual learning, DeepFake Detection | —Unverified | 0 | 0 |
| Multi-Input Multi-Output Target-Speaker Voice Activity Detection For Unified, Flexible, and Robust Audio-Visual Speaker Diarization | Jan 16, 2024 | Action Detection, Activity Detection | —Unverified | 0 | 0 |
| Object Segmentation with Audio Context | Jan 4, 2023 | audio-visual learning, Decoder | —Unverified | 0 | 0 |
| RealImpact: A Dataset of Impact Sound Fields for Real Objects | Jun 16, 2023 | audio-visual learning | —Unverified | 0 | 0 |
| Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives | Feb 17, 2025 | Adversarial Robustness, audio-visual learning | —Unverified | 0 | 0 |
| Sequential Contrastive Audio-Visual Learning | Jul 8, 2024 | audio-visual learning, Contrastive Learning | —Unverified | 0 | 0 |
| Telling Left from Right: Learning Spatial Correspondence of Sight and Sound | Jun 11, 2020 | audio-visual learning | —Unverified | 0 | 0 |
| Unveiling Visual Biases in Audio-Visual Localization Benchmarks | Aug 25, 2024 | audio-visual learning, Visual Localization | —Unverified | 0 | 0 |