| Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models | Sep 21, 2024 | DeepFake Detection, Face Swapping | Unverified | 0 |
| Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-Labels | Sep 16, 2024 | Speaker Recognition, Speaker Verification | Unverified | 0 |
| RoboVox Far Field Speaker Recognition: A Novel Data Augmentation Approach with Pretrained Models | Sep 16, 2024 | Data Augmentation, Speaker Recognition | Unverified | 0 |
| Text-To-Speech Synthesis In The Wild | Sep 13, 2024 | Benchmarking, Speaker Recognition | Unverified | 0 |
| USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction | Sep 4, 2024 | Speaker Recognition, Speech Separation | Code Available | 1 |
| Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings | Aug 30, 2024 | Speaker Diarization | Unverified | 0 |
| The VoxCeleb Speaker Recognition Challenge: A Retrospective | Aug 27, 2024 | Domain Adaptation, Speaker Recognition | Unverified | 0 |
| Convexity-based Pruning of Speech Representation Models | Aug 16, 2024 | Keyword Spotting, Self-Supervised Learning | Unverified | 0 |
| Long-Term Conversation Analysis: Privacy-Utility Trade-off under Noise and Reverberation | Aug 1, 2024 | Action Detection, Activity Detection | Unverified | 0 |
| VoxSim: A perceptual voice similarity dataset | Jul 26, 2024 | Benchmarking, Speaker Recognition | Code Available | 1 |
| Reshape Dimensions Network for Speaker Recognition | Jul 25, 2024 | Speaker Recognition | Code Available | 2 |
| Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning | Jul 21, 2024 | Representation Learning, Self-Supervised Learning | Unverified | 0 |
| Team HYU ASML ROBOVOX SP Cup 2024 System Description | Jul 16, 2024 | Data Augmentation, Speaker Recognition | Unverified | 0 |
| VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark | Jul 16, 2024 | Diversity, Speaker Identification | Code Available | 5 |
| Phonetic Richness for Improved Automatic Speaker Verification | Jul 10, 2024 | Speaker Recognition, Speaker Verification | Unverified | 0 |
| A voice and speech corpus of patients who underwent upper airway surgery in pre- and post-operative states | Jul 9, 2024 | Articles, Classification | Code Available | 0 |
| Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation | Jul 8, 2024 | Automatic Speech Recognition, Emotion Recognition | Unverified | 0 |
| We Need Variations in Speech Generation: Sub-center Modelling for Speaker Embeddings | Jul 5, 2024 | Speaker Recognition, Speech Synthesis | Unverified | 0 |
| Prosody-Driven Privacy-Preserving Dementia Detection | Jul 3, 2024 | Attribute, Diagnostic | Code Available | 0 |
| Open-Source Conversational AI with SpeechBrain 1.0 | Jun 29, 2024 | Language Modeling | Unverified | 0 |
| CEC: A Noisy Label Detection Method for Speaker Recognition | Jun 19, 2024 | Speaker Recognition, Speaker Verification | Unverified | 0 |
| Challenging margin-based speaker embedding extractors by using the variational information bottleneck | Jun 18, 2024 | Speaker Recognition | Unverified | 0 |
| PERSONA: An Application for Emotion Recognition, Gender Recognition and Age Estimation | Jun 10, 2024 | Age Estimation, Emotion Recognition | Unverified | 0 |
| The Reasonable Effectiveness of Speaker Embeddings for Violence Detection | Jun 10, 2024 | Speaker Recognition | Unverified | 0 |
| Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting | May 30, 2024 | Audio Synthesis, Representation Learning | Unverified | 0 |