| ActionFormer: Localizing Moments of Actions with Transformers | Feb 16, 2022 | Action Localization, Action Recognition | Code Available | 2 |
| Positive Sample Propagation along the Audio-Visual Event Line | Apr 1, 2021 | audio-visual event localization | Code Available | 1 |
| Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline | Mar 22, 2023 | audio-visual event localization | Code Available | 1 |
| Audio-Visual Event Localization in Unconstrained Videos | Mar 23, 2018 | audio-visual event localization, Temporal Localization | Code Available | 1 |
| MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing | Nov 24, 2021 | audio-visual event localization, Video Understanding | Code Available | 1 |
| Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser | May 27, 2023 | audio-visual event localization, audio-visual learning | Code Available | 1 |
| Dual-modality seq2seq network for audio-visual event localization | Feb 20, 2019 | audio-visual event localization | Code Available | 1 |
| Cross-Modal Background Suppression for Audio-Visual Event Localization | Jan 1, 2022 | audio-visual event localization | Code Available | 1 |
| Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration | Dec 17, 2024 | audio-visual event localization, audio-visual learning | Code Available | 1 |
| Towards Open-Vocabulary Audio-Visual Event Localization | Nov 18, 2024 | audio-visual event localization | Code Available | 1 |