| Learnings from curating a trustworthy, well-annotated, and useful dataset of disordered English speech | Sep 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy? | Sep 13, 2024 | Automatic Speech Recognition, Decoder | Unverified | 0 |
| Exploring SSL Discrete Tokens for Multilingual ASR | Sep 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Exploring the Impact of Data Quantity on ASR in Extremely Low-resource Languages | Sep 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| LA-RAG:Enhancing LLM-based ASR Accuracy with Retrieval-Augmented Generation | Sep 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions | Sep 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| NEST-RQ: Next Token Prediction for Speech Self-Supervised Pre-Training | Sep 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Detecting and Defending Against Adversarial Attacks on Automatic Speech Recognition via Diffusion Models | Sep 12, 2024 | Adversarial Attack, Adversarial Purification | Code Available | 0 |
| WhisperNER: Unified Open Named Entity and Speech Recognition | Sep 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 |
| The Faetar Benchmark: Speech Recognition in a Very Under-Resourced Language | Sep 12, 2024 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |