| Chaotic World: A Large and Challenging Benchmark for Human Behavior Understanding in Chaotic Events | Jan 1, 2023 | Action Localization, Pathfinder | Code Available | 1 |
| Learning to Visually Localize Sound Sources from Mixtures without Prior Source Knowledge | Mar 26, 2024 | Object, Sound Source Localization | Code Available | 1 |
| Audio-Visual Grouping Network for Sound Localization from Mixtures | Mar 29, 2023 | Object Localization, Sound Source Localization | Code Available | 1 |
| Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment | Jul 18, 2024 | cross-modal alignment, Cross-Modal Retrieval | Code Available | 1 |
| Audio-Visual Instance Segmentation | Oct 28, 2023 | Instance Segmentation, Segmentation | Code Available | 1 |
| Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes | Mar 25, 2022 | Contrastive Learning, Sound Source Localization | Code Available | 1 |
| A Proposal-Based Paradigm for Self-Supervised Sound Source Localization in Videos | Jan 1, 2022 | Multiple Instance Learning, Sound Source Localization | Unverified | 0 |
| Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function | Oct 26, 2022 | Active Speaker Detection, Sound Source Localization | Unverified | 0 |
| AcousticFusion: Fusing Sound Source Localization to Visual SLAM in Dynamic Environments | Aug 3, 2021 | Depth Estimation, Object | Unverified | 0 |
| DiffusionRIR: Room Impulse Response Interpolation using Diffusion Models | Apr 29, 2025 | Audio Signal Processing, Data Augmentation | Unverified | 0 |