| An Embarrassingly Simple Approach for LLM with Strong ASR Capacity | Feb 13, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension | Feb 12, 2024 | 2kAutomatic Speech Recognition | CodeCode Available | 2 |
| Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions | Sep 13, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| Large Language Models are Efficient Learners of Noise-Robust Speech Recognition | Jan 19, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT | Oct 7, 2023 | Audio captioningAutomatic Speech Recognition | CodeCode Available | 2 |
| Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction | Jan 5, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition | Dec 30, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| Dialectal Coverage And Generalization in Arabic Speech Recognition | Nov 7, 2024 | Arabic Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 2 |
| CMGAN: Conformer-based Metric GAN for Speech Enhancement | Mar 28, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement | Sep 22, 2022 | Audio Super-ResolutionAutomatic Speech Recognition | CodeCode Available | 2 |