| Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation | Feb 8, 2024 | Automatic Speech Recognition (ASR) | Code Available | 2 |
| Large Language Models are Efficient Learners of Noise-Robust Speech Recognition | Jan 19, 2024 | Automatic Speech Recognition (ASR) | Code Available | 2 |
| Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition | Oct 10, 2023 | Automatic Speech Recognition (ASR) | Code Available | 2 |
| LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT | Oct 7, 2023 | Audio captioning, Automatic Speech Recognition | Code Available | 2 |
| LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of end-to-end ASR Models | Oct 4, 2023 | Automatic Speech Recognition (ASR) | Code Available | 2 |
| FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec | Sep 14, 2023 | Automatic Speech Recognition, speech-recognition | Code Available | 2 |
| PromptASR for contextualized ASR with controllable style | Sep 14, 2023 | Automatic Speech Recognition, speech-recognition | Code Available | 2 |
| SeamlessM4T: Massively Multilingual & Multimodal Machine Translation | Aug 22, 2023 | Automatic Speech Recognition, Machine Translation | Code Available | 2 |
| Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels | Mar 25, 2023 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 2 |
| Stabilizing Transformer Training by Preventing Attention Entropy Collapse | Mar 11, 2023 | Automatic Speech Recognition, image-classification | Code Available | 2 |