| CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization | May 6, 2025 | Active Speaker Detection, Audio-Visual Speech Recognition | Code Available | 2 |
| Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection | Jul 14, 2021 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| Active Speakers in Context | May 20, 2020 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies | Feb 20, 2024 | Active Speaker Detection | Code Available | 1 |
| A Light Weight Model for Active Speaker Detection | Mar 8, 2023 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild | Jun 7, 2021 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| GestSync: Determining who is speaking without a talking head | Oct 8, 2023 | Active Speaker Detection, Gesture Synchronization | Code Available | 1 |
| AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection | Jan 5, 2019 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection | Dec 1, 2022 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| LASER: Lip Landmark Assisted Speaker Detection for Robustness | Jan 21, 2025 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |