| HPP-Voice: A Large-Scale Evaluation of Speech Embeddings for Multi-Phenotypic Classification | May 22, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0 |
| Quantized Approximate Signal Processing (QASP): Towards Homomorphic Encryption for audio | May 15, 2025 | Speaker Identification, speech-recognition | Unverified | 0 |
| Speaker Fuzzy Fingerprints: Benchmarking Text-Based Identification in Multiparty Dialogues | Apr 21, 2025 | Benchmarking, Speaker Identification | Unverified | 0 |
| From Dialect Gaps to Identity Maps: Tackling Variability in Speaker Verification | Apr 21, 2025 | Data Augmentation, Speaker Identification | Unverified | 0 |
| Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization | Feb 18, 2025 | Automatic Speech Recognition, Speaker Identification | Unverified | 0 |
| A Preliminary Exploration with GPT-4o Voice Mode | Feb 14, 2025 | Age Classification, Audio Deepfake Detection | Unverified | 0 |
| SCDiar: a streaming diarization system based on speaker change detection and speech recognition | Jan 28, 2025 | Change Detection, speaker-diarization | Unverified | 0 |
| Characteristic-Specific Partial Fine-Tuning for Efficient Emotion and Speaker Adaptation in Codec Language Text-to-Speech Models | Jan 24, 2025 | Emotion Classification, Speaker Identification | Unverified | 0 |
| PolInterviews -- A Dataset of German Politician Public Broadcast Interviews | Jan 8, 2025 | Speaker Identification | Unverified | 0 |
| Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding | Dec 23, 2024 | Speaker Identification | Code Available | 0 |
| Machine Unlearning reveals that the Gender-based Violence Victim Condition can be detected from Speech in a Speaker-Agnostic Setting | Nov 27, 2024 | Machine Unlearning, Speaker Identification | Unverified | 0 |
| Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network | Nov 22, 2024 | Data Augmentation, Speaker Identification | Code Available | 0 |
| Towards Advanced Speech Signal Processing: A Statistical Perspective on Convolution-Based Architectures and its Applications | Nov 20, 2024 | Emotion Recognition, Speaker Identification | Unverified | 0 |
| Incorporating Talker Identity Aids With Improving Speech Recognition in Adversarial Environments | Oct 7, 2024 | Speaker Identification, speech-recognition | Unverified | 0 |
| Exploring VQ-VAE with Prosody Parameters for Speaker Anonymization | Sep 24, 2024 | Decoder, Speaker anonymization | Unverified | 0 |
| Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample | Sep 24, 2024 | Speaker Identification, Speaker Recognition | Unverified | 0 |
| How Redundant Is the Transformer Stack in Speech Representation Models? | Sep 10, 2024 | Knowledge Distillation, Speaker Identification | Unverified | 0 |
| A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR | Sep 9, 2024 | Automatic Speech Recognition, speaker-diarization | Unverified | 0 |
| Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue | Sep 7, 2024 | Question Answering, Speaker Identification | Code Available | 0 |
| Progressive Residual Extraction based Pre-training for Speech Representation Learning | Aug 31, 2024 | Emotion Recognition, Representation Learning | Unverified | 0 |
| Deep Learning for Speaker Identification: Architectural Insights from AB-1 Corpus Analysis and Performance Evaluation | Aug 13, 2024 | Speaker Identification | Code Available | 0 |
| Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models | Jul 16, 2024 | Attribute, Speaker Identification | Code Available | 0 |
| DASB -- Discrete Audio and Speech Benchmark | Jun 20, 2024 | Benchmarking, Emotion Recognition | Unverified | 0 |
| Evaluating Speaker Identity Coding in Self-supervised Models and Humans | Jun 14, 2024 | Speaker Identification | Unverified | 0 |
| TIMIT Speaker Profiling: A Comparison of Multi-task learning and Single-task learning Approaches | Apr 18, 2024 | Age Estimation, Classification | Unverified | 0 |
| Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | Apr 9, 2024 | Audio Classification | Code Available | 0 |
| Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling | Apr 1, 2024 | Speaker Identification, Speech Synthesis | Unverified | 0 |
| Hearing-Loss Compensation Using Deep Neural Networks: A Framework and Results From a Listening Test | Mar 15, 2024 | Music Classification, Speaker Identification | Unverified | 0 |
| A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement | Mar 3, 2024 | Automatic Speech Recognition, Keyword Spotting | Unverified | 0 |
| Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification | Feb 29, 2024 | Adversarial Attack, Classification | Unverified | 0 |
| Effect of utterance duration and phonetic content on speaker identification using second-order statistical methods | Feb 26, 2024 | Speaker Identification | Unverified | 0 |
| Significance of Chirp MFCC as a Feature in Speech and Audio Applications | Feb 19, 2024 | Music Classification, Speaker Identification | Unverified | 0 |
| Probing Self-supervised Learning Models with Target Speech Extraction | Feb 17, 2024 | Self-Supervised Learning, Speaker Identification | Unverified | 0 |
| Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis | Feb 11, 2024 | Rhythm, Speaker Identification | Unverified | 0 |
| Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition Models | Jan 23, 2024 | Speaker Identification, Speaker Recognition | Unverified | 0 |
| SIG: Speaker Identification in Literature via Prompt-Based Generation | Dec 22, 2023 | Speaker Identification | Code Available | 0 |
| Voxceleb-ESP: preliminary experiments detecting Spanish celebrities from their voices | Dec 20, 2023 | Speaker Identification, Speaker Recognition | Unverified | 0 |
| Efficiency-oriented approaches for self-supervised speech representation learning | Dec 18, 2023 | Automatic Speech Recognition, Representation Learning | Unverified | 0 |
| Privacy-preserving Representation Learning for Speech Understanding | Oct 26, 2023 | Classification, Emotion Recognition | Unverified | 0 |
| Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition | Oct 17, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis | Oct 16, 2023 | Automatic Speech Recognition, Decoder | Unverified | 0 |
| Test-Time Training for Speech | Sep 19, 2023 | parameter-efficient fine-tuning, Speaker Identification | Unverified | 0 |
| Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks | Sep 18, 2023 | Keyword Spotting, Speaker Identification | Unverified | 0 |
| Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction | Sep 7, 2023 | Keyword Spotting, Self-Supervised Learning | Unverified | 0 |
| An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification | Aug 22, 2023 | Self-Supervised Learning, Speaker Identification | Code Available | 0 |
| Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment | Jul 6, 2023 | Speaker Identification, speech-recognition | Code Available | 0 |
| Read, Look or Listen? What's Needed for Solving a Multimodal Dataset | Jul 6, 2023 | Question Answering, Speaker Identification | Unverified | 0 |
| VoxWatch: An open-set speaker recognition benchmark on VoxCeleb | Jun 30, 2023 | Speaker Identification, Speaker Recognition | Unverified | 0 |
| Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition | Jun 1, 2023 | Meta-Learning, Speaker Identification | Unverified | 0 |
| Few-Shot Speaker Identification Using Lightweight Prototypical Network with Feature Grouping and Interaction | May 31, 2023 | Speaker Identification | Unverified | 0 |