| A Modulation-Domain Loss for Neural-Network-based Real-time Speech Enhancement | Feb 15, 2021 | Speaker Identification, Speech Denoising | Code Available | 1 |
| Generative Pre-Training for Speech with Autoregressive Predictive Coding | Oct 23, 2019 | Representation Learning, Speaker Identification | Code Available | 1 |
| Whisper Speaker Identification: Leveraging Pre-Trained Multilingual Transformers for Robust Speaker Embeddings | Mar 13, 2025 | Speaker Identification, speech-recognition | Code Available | 1 |
| Blind Speech Separation and Dereverberation using Neural Beamforming | Mar 24, 2021 | Speaker Identification, Speaker Separation | Code Available | 1 |
| Disentangling Textual and Acoustic Features of Neural Speech Representations | Oct 3, 2024 | Disentanglement, Emotion Recognition | Code Available | 1 |
| Extended U-Net for Speaker Verification in Noisy Environments | Jun 27, 2022 | Denoising, Speaker Identification | Code Available | 1 |
| GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding | May 16, 2023 | Speaker Identification | Code Available | 1 |
| FastAudio: A Learnable Audio Front-End for Spoof Speech Detection | Sep 6, 2021 | Speaker Identification, Speaker Verification | Code Available | 1 |
| FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances | Nov 17, 2020 | Adversarial Attack, Speaker Identification | Code Available | 1 |
| Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam | Jan 23, 2020 | Speaker Identification, Speech Extraction | Code Available | 1 |
| IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages | Aug 24, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| ComiCap: A VLMs pipeline for dense captioning of Comic Panels | Sep 24, 2024 | Attribute, Dense Captioning | Code Available | 1 |
| CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding | Jul 4, 2024 | Dialogue Generation, object-detection | Code Available | 1 |
| Masked Autoencoders that Listen | Jul 13, 2022 | Audio Classification, Decoder | Code Available | 1 |
| MelHuBERT: A simplified HuBERT on Mel spectrograms | Nov 17, 2022 | Automatic Speech Recognition, Self-Supervised Learning | Code Available | 1 |
| ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification | Nov 23, 2022 | Keyword Spotting, Self-Supervised Learning | Code Available | 1 |
| Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR | Nov 3, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Non-uniform Speaker Disentanglement For Depression Detection From Raw Speech Signals | Jun 2, 2023 | Depression Detection, Disentanglement | Code Available | 1 |
| Learning Speaker Representations with Mutual Information | Dec 1, 2018 | Sentence, Speaker Identification | Code Available | 1 |
| A user study to compare two conversational assistants designed for people with hearing impairments | Jun 1, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Advances in Online Audio-Visual Meeting Transcription | Dec 10, 2019 | Sound Source Localization, speaker-diarization | Unverified | 0 |
| A Multi Level Data Fusion Approach for Speaker Identification on Telephone Speech | Jun 27, 2014 | Speaker Identification | Unverified | 0 |
| Adaptive blind audio source extraction supervised by dominant speaker identification using x-vectors | Oct 25, 2019 | Speaker Identification | Unverified | 0 |
| Emirati-Accented Speaker Identification in Stressful Talking Conditions | Sep 28, 2019 | Speaker Identification | Unverified | 0 |
| Advanced Rich Transcription System for Estonian Speech | Jan 11, 2019 | Speaker Identification | Unverified | 0 |