| CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization | May 6, 2025 | Active Speaker Detection, Audio-Visual Speech Recognition | Code Available | 2 |
| AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection | Jan 5, 2019 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| Active Speakers in Context | May 20, 2020 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning | Sep 21, 2023 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| Unsupervised active speaker detection in media content using cross-modal information | Sep 24, 2022 | Active Speaker Detection | Code Available | 1 |
| GestSync: Determining who is speaking without a talking head | Oct 8, 2023 | Active Speaker Detection, Gesture Synchronization | Code Available | 1 |
| How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild | Jun 7, 2021 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| UniTalk: Towards Universal Active Speaker Detection in Real World Scenarios | May 28, 2025 | Active Speaker Detection | Code Available | 1 |
| Self-Supervised Learning of Audio-Visual Objects from Video | Aug 10, 2020 | Active Speaker Detection, Face Detection | Code Available | 1 |
| AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies | Feb 20, 2024 | Active Speaker Detection | Code Available | 1 |
| Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection | Jul 14, 2021 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| LASER: Lip Landmark Assisted Speaker Detection for Robustness | Jan 21, 2025 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection | Jul 15, 2022 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection | Dec 1, 2022 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| A Light Weight Model for Active Speaker Detection | Mar 8, 2023 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| NUS-HLT Report for ActivityNet Challenge 2021 AVA (Speaker) | Jun 1, 2021 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| WASD: A Wilder Active Speaker Detection Dataset | Mar 9, 2023 | Active Speaker Detection | Code Available | 1 |
| LoCoNet: Long-Short Context Network for Active Speaker Detection | Jan 19, 2023 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1 |
| Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement | Mar 4, 2022 | Active Speaker Detection, Multi-Task Learning | Code Available | 1 |
| Target Active Speaker Detection with Audio-visual Cues | May 22, 2023 | Active Speaker Detection, Audio-Visual Synchronization | Code Available | 1 |
| Look Who's Talking: Active Speaker Detection in the Wild | Aug 17, 2021 | Active Speaker Detection | Code Available | 1 |
| Rethinking Audio-visual Synchronization for Active Speaker Detection | Jun 21, 2022 | Active Speaker Detection, Audio-Visual Synchronization | Unverified | 0 |
| An Efficient and Streaming Audio Visual Active Speaker Detection System | Sep 13, 2024 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Unverified | 0 |
| A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism | Sep 15, 2023 | Active Speaker Detection, Edge-computing | Unverified | 0 |
| Audio Inputs for Active Speaker Detection and Localization via Microphone Array | Jul 27, 2023 | Active Speaker Detection | Unverified | 0 |
| Audio-video fusion strategies for active speaker detection in meetings | Jun 9, 2022 | Active Speaker Detection, Management | Unverified | 0 |
| Audio-visual child-adult speaker classification in dyadic interactions | Oct 3, 2023 | Active Speaker Detection, Classification | Unverified | 0 |
| Audio-Visual Talker Localization in Video for Spatial Sound Reproduction | Jun 1, 2024 | Active Speaker Detection | Unverified | 0 |
| Best of Both Worlds: Multi-task Audio-Visual Automatic Speech Recognition and Active Speaker Detection | May 10, 2022 | Active Speaker Detection, Automatic Speech Recognition | Unverified | 0 |
| Cross-modal Supervision for Learning Active Speaker Detection in Video | Mar 29, 2016 | Action Detection, Active Speaker Detection | Unverified | 0 |
| Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function | Oct 26, 2022 | Active Speaker Detection, Sound Source Localization | Unverified | 0 |
| Detection and Analysis of Content Creator Collaborations in YouTube Videos using Face- and Speaker-Recognition | Jul 5, 2018 | Active Speaker Detection, Face Recognition | Unverified | 0 |
| Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization | Jan 6, 2022 | Action Detection, Active Speaker Detection | Unverified | 0 |
| End-To-End Audiovisual Feature Fusion for Active Speaker Detection | Jul 27, 2022 | Active Speaker Detection | Unverified | 0 |
| Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training | Apr 1, 2024 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Unverified | 0 |
| FaVoA: Face-Voice Association Favours Ambiguous Speaker Detection | Sep 1, 2021 | Active Speaker Detection | Unverified | 0 |
| How to Squeeze An Explanation Out of Your Model | Dec 6, 2024 | Active Speaker Detection | Unverified | 0 |
| ICTCAS-UCAS-TAL Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2021 | Jun 1, 2021 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Unverified | 0 |
| Intel Labs at Ego4D Challenge 2022: A Better Baseline for Audio-Visual Diarization | Oct 14, 2022 | Action Detection, Active Speaker Detection | Unverified | 0 |
| Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos | Jul 10, 2023 | Active Speaker Detection, Audio Denoising | Unverified | 0 |
| Learning Spatial-Temporal Graphs for Active Speaker Detection | Dec 2, 2021 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Unverified | 0 |
| Data standardization for robust lip sync | Feb 13, 2022 | 3D Face Reconstruction, Active Speaker Detection | Unverified | 0 |
| Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection | Oct 3, 2022 | Active Speaker Detection, Adversarial Robustness | Unverified | 0 |
| Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-based Multimodal Fusion | Jun 7, 2021 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Unverified | 0 |
| Robust Active Speaker Detection in Noisy Environments | Mar 27, 2024 | Active Speaker Detection, Speech Separation | Unverified | 0 |
| Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition | Nov 24, 2017 | Active Speaker Detection, Language Acquisition | Unverified | 0 |
| Spot the conversation: speaker diarisation in the wild | Jul 2, 2020 | Active Speaker Detection, Speaker Verification | Unverified | 0 |
| Understanding Co-speech Gestures in-the-wild | Mar 28, 2025 | Active Speaker Detection | Unverified | 0 |
| UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022 | Jun 22, 2022 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Unverified | 0 |
| UniCon: Unified Context Network for Robust Active Speaker Detection | Aug 5, 2021 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Unverified | 0 |