Active Speaker Detection

Papers

Showing 1–50 of 63 papers

Title	Date	Tasks	Status	Hype	Score
CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization	May 6, 2025	Active Speaker Detection, Audio-Visual Speech Recognition	Code Available	2	5
Look Who's Talking: Active Speaker Detection in the Wild	Aug 17, 2021	Active Speaker Detection	Code Available	1	5
AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies	Feb 20, 2024	Active Speaker Detection	Code Available	1	5
UniTalk: Towards Universal Active Speaker Detection in Real World Scenarios	May 28, 2025	Active Speaker Detection	Code Available	1	5
Unsupervised active speaker detection in media content using cross-modal information	Sep 24, 2022	Active Speaker Detection	Code Available	1	5
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection	Jul 14, 2021	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
WASD: A Wilder Active Speaker Detection Dataset	Mar 9, 2023	Active Speaker Detection	Code Available	1	5
LASER: Lip Landmark Assisted Speaker Detection for Robustness	Jan 21, 2025	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
A Light Weight Model for Active Speaker Detection	Mar 8, 2023	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection	Jul 15, 2022	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection	Jan 5, 2019	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
Active Speakers in Context	May 20, 2020	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
NUS-HLT Report for ActivityNet Challenge 2021 AVA (Speaker)	Jun 1, 2021	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection	Dec 1, 2022	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
LoCoNet: Long-Short Context Network for Active Speaker Detection	Jan 19, 2023	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
GestSync: Determining who is speaking without a talking head	Oct 8, 2023	Active Speaker Detection, Gesture Synchronization	Code Available	1	5
Self-Supervised Learning of Audio-Visual Objects from Video	Aug 10, 2020	Active Speaker Detection, Face Detection	Code Available	1	5
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild	Jun 7, 2021	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement	Mar 4, 2022	Active Speaker Detection, Multi-Task Learning	Code Available	1	5
Target Active Speaker Detection with Audio-visual Cues	May 22, 2023	Active Speaker Detection, Audio-Visual Synchronization	Code Available	1	5
TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning	Sep 21, 2023	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	1	5
Whose Emotion Matters? Speaking Activity Localisation without Prior Knowledge	Nov 23, 2022	Active Speaker Detection, Automatic Speech Recognition	Code Available	0	5
Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization	Dec 21, 2023	Active Speaker Detection, Self-Supervised Learning	Code Available	0	5
ASDnB: Merging Face with Body Cues For Robust Active Speaker Detection	Dec 11, 2024	Active Speaker Detection, Feature Importance	Code Available	0	5
MAAS: Multi-modal Assignation for Active Speaker Detection	Jan 11, 2021	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	0	5
End-to-End Active Speaker Detection	Mar 27, 2022	Active Speaker Detection, Audio-Visual Active Speaker Detection	Code Available	0	5
FabuLight-ASD: Unveiling Speech Activity via Body Language	Nov 20, 2024	Active Speaker Detection	Code Available	0	5
BIAS: A Body-based Interpretable Active Speaker Approach	Dec 6, 2024	Active Speaker Detection, Feature Importance	Code Available	0	5
Bio-Inspired Modality Fusion for Active Speaker Detection	Feb 28, 2020	Active Speaker Detection	Code Available	0	5
Imitation of human motion achieves natural head movements for humanoid robots in an active-speaker detection task	Jul 16, 2024	Active Speaker Detection	Code Available	0	5
UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022	Jun 22, 2022	Active Speaker Detection, Audio-Visual Active Speaker Detection	Unverified	0	0
UniCon: Unified Context Network for Robust Active Speaker Detection	Aug 5, 2021	Active Speaker Detection, Audio-Visual Active Speaker Detection	Unverified	0	0
Visually Supervised Speaker Detection and Localization via Microphone Array	Mar 7, 2022	Active Speaker Detection	Unverified	0	0
Understanding Co-speech Gestures in-the-wild	Mar 28, 2025	Active Speaker Detection	Unverified	0	0
An Efficient and Streaming Audio Visual Active Speaker Detection System	Sep 13, 2024	Active Speaker Detection, Audio-Visual Active Speaker Detection	Unverified	0	0
A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism	Sep 15, 2023	Active Speaker Detection, Edge-computing	Unverified	0	0
Audio Inputs for Active Speaker Detection and Localization via Microphone Array	Jul 27, 2023	Active Speaker Detection	Unverified	0	0
Audio-video fusion strategies for active speaker detection in meetings	Jun 9, 2022	Active Speaker Detection, Management	Unverified	0	0
Audio-visual child-adult speaker classification in dyadic interactions	Oct 3, 2023	Active Speaker Detection, Classification	Unverified	0	0
Audio-Visual Talker Localization in Video for Spatial Sound Reproduction	Jun 1, 2024	Active Speaker Detection	Unverified	0	0
Best of Both Worlds: Multi-task Audio-Visual Automatic Speech Recognition and Active Speaker Detection	May 10, 2022	Active Speaker Detection, Automatic Speech Recognition	Unverified	0	0
Cross-modal Supervision for Learning Active Speaker Detection in Video	Mar 29, 2016	Action Detection, Active Speaker Detection	Unverified	0	0
Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function	Oct 26, 2022	Active Speaker Detection, Sound Source Localization	Unverified	0	0
Detection and Analysis of Content Creator Collaborations in YouTube Videos using Face- and Speaker-Recognition	Jul 5, 2018	Active Speaker Detection, Face Recognition	Unverified	0	0
Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization	Jan 6, 2022	Action Detection, Active Speaker Detection	Unverified	0	0
End-To-End Audiovisual Feature Fusion for Active Speaker Detection	Jul 27, 2022	Active Speaker Detection	Unverified	0	0
Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training	Apr 1, 2024	Active Speaker Detection, Audio-Visual Active Speaker Detection	Unverified	0	0
FaVoA: Face-Voice Association Favours Ambiguous Speaker Detection	Sep 1, 2021	Active Speaker Detection	Unverified	0	0
How to Squeeze An Explanation Out of Your Model	Dec 6, 2024	Active Speaker Detection	Unverified	0	0
ICTCAS-UCAS-TAL Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2021	Jun 1, 2021	Active Speaker Detection, Audio-Visual Active Speaker Detection	Unverified	0	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GestSync	Accuracy	87	—	Unverified