| Incorporating Talker Identity Aids With Improving Speech Recognition in Adversarial Environments | Oct 7, 2024 | Speaker Identification, speech-recognition | Unverified | 0 |
| Disentangling Textual and Acoustic Features of Neural Speech Representations | Oct 3, 2024 | Disentanglement, Emotion Recognition | Code Available | 1 |
| Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample | Sep 24, 2024 | Speaker Identification, Speaker Recognition | Unverified | 0 |
| ComiCap: A VLMs pipeline for dense captioning of Comic Panels | Sep 24, 2024 | Attribute, Dense Captioning | Code Available | 1 |
| Exploring VQ-VAE with Prosody Parameters for Speaker Anonymization | Sep 24, 2024 | Decoder, Speaker anonymization | Unverified | 0 |
| How Redundant Is the Transformer Stack in Speech Representation Models? | Sep 10, 2024 | Knowledge Distillation, Speaker Identification | Unverified | 0 |
| A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR | Sep 9, 2024 | Automatic Speech Recognition, speaker-diarization | Unverified | 0 |
| Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue | Sep 7, 2024 | Question Answering, Speaker Identification | Code Available | 0 |
| Progressive Residual Extraction based Pre-training for Speech Representation Learning | Aug 31, 2024 | Emotion Recognition, Representation Learning | Unverified | 0 |
| Deep Learning for Speaker Identification: Architectural Insights from AB-1 Corpus Analysis and Performance Evaluation | Aug 13, 2024 | Speaker Identification | Code Available | 0 |