| An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS | Jun 25, 2025 | Speaker Recognition, text-to-speech | Unverified | 0 |
| A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-World Noisy Environments | Jun 17, 2025 | Denoising, Speaker Recognition | Unverified | 0 |
| CoLMbo: Speaker Language Model for Descriptive Profiling | Jun 11, 2025 | Descriptive, Language Modeling | Code Available | 0 |
| Learning Speaker-Invariant Visual Features for Lipreading | Jun 9, 2025 | Disentanglement, Lipreading | Unverified | 0 |
| Rhythm Features for Speaker Identification | Jun 7, 2025 | Deep Learning, Rhythm | Unverified | 0 |
| Synthetic Speech Source Tracing using Metric Learning | Jun 3, 2025 | Metric Learning, Self-Supervised Learning | Unverified | 0 |
| LASPA: Language Agnostic Speaker Disentanglement with Prefix-Tuned Cross-Attention | Jun 2, 2025 | Anatomy, Disentanglement | Unverified | 0 |
| Investigating the Reasonable Effectiveness of Speaker Pre-Trained Models and their Synergistic Power for SingMOS Prediction | Jun 2, 2025 | Speaker Recognition | Unverified | 0 |
| Source Tracing of Synthetic Speech Systems Through Paralinguistic Pre-Trained Representations | Jun 1, 2025 | Emotion Recognition, Rhythm | Unverified | 0 |
| Pretraining Multi-Speaker Identification for Neural Speaker Diarization | May 30, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0 |
| Private kNN-VC: Interpretable Anonymization of Converted Speech | May 23, 2025 | Speaker anonymization, Speaker Recognition | Code Available | 0 |
| SEED: Speaker Embedding Enhancement Diffusion Model | May 22, 2025 | model, Speaker Recognition | Code Available | 2 |
| Analysis of ABC Frontend Audio Systems for the NIST-SRE24 | May 21, 2025 | Speaker Recognition | Unverified | 0 |
| SoCov: Semi-Orthogonal Parametric Pooling of Covariance Matrix for Speaker Recognition | Apr 23, 2025 | Speaker Recognition | Unverified | 0 |
| From Dialect Gaps to Identity Maps: Tackling Variability in Speaker Verification | Apr 21, 2025 | Data Augmentation, Speaker Identification | Unverified | 0 |
| Audio-to-Image Encoding for Improved Voice Characteristic Detection Using Deep Convolutional Neural Networks | Mar 7, 2025 | Speaker Recognition | Unverified | 0 |
| Language Modelling for Speaker Diarization in Telephonic Interviews | Jan 28, 2025 | Acoustic Modelling, Language Modelling | Unverified | 0 |
| VoxVietnam: a Large-Scale Multi-Genre Dataset for Vietnamese Speaker Recognition | Dec 31, 2024 | Diversity, Speaker Recognition | Unverified | 0 |
| Investigating Prosodic Signatures via Speech Pre-Trained Models for Audio Deepfake Source Attribution | Dec 23, 2024 | Audio Deepfake Detection, DeepFake Detection | Unverified | 0 |
| Study on Inter and Intra Speaker Variability in Speaker Recognition | Nov 12, 2024 | Diversity, Speaker Recognition | Unverified | 0 |
| Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks | Oct 16, 2024 | Age Estimation, Emotion Recognition | Unverified | 0 |
| Investigation of Speaker Representation for Target-Speaker Speech Processing | Oct 15, 2024 | Action Detection, Activity Detection | Unverified | 0 |
| The OCON model: an old but green solution for distributable supervised classification for acoustic monitoring in smart cities | Oct 5, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample | Sep 24, 2024 | Speaker Identification, Speaker Recognition | Unverified | 0 |
| Avengers Assemble: Amalgamation of Non-Semantic Features for Depression Detection | Sep 22, 2024 | Depression Detection, Emotion Recognition | Unverified | 0 |