| Dialectal Coverage And Generalization in Arabic Speech Recognition | Nov 7, 2024 | Arabic Speech Recognition, Automatic Speech Recognition | Code Available | 2 | 5 |
| DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition | Dec 30, 2024 | Automatic Speech Recognition (ASR) | Code Available | 2 | 5 |
| Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions | Sep 13, 2024 | Automatic Speech Recognition (ASR) | Code Available | 2 | 5 |
| Large Language Models are Efficient Learners of Noise-Robust Speech Recognition | Jan 19, 2024 | Automatic Speech Recognition (ASR) | Code Available | 2 | 5 |
| LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT | Oct 7, 2023 | Audio captioning, Automatic Speech Recognition | Code Available | 2 | 5 |
| Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction | Jan 5, 2022 | Automatic Speech Recognition (ASR) | Code Available | 2 | 5 |
| An Embarrassingly Simple Approach for LLM with Strong ASR Capacity | Feb 13, 2024 | Automatic Speech Recognition (ASR) | Code Available | 2 | 5 |
| CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement | Sep 22, 2022 | Audio Super-Resolution, Automatic Speech Recognition | Code Available | 2 | 5 |
| BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric | Dec 16, 2022 | Automatic Speech Recognition (ASR) | Code Available | 2 | 5 |
| Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels | Mar 25, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 2 | 5 |