| Large Language Model Should Understand Pinyin for Chinese ASR Error Correction | Sep 20, 2024 | Automatic Speech Recognition, Language Modeling | Unverified | 0 |
| A Multimodal Dense Retrieval Approach for Speech-Based Open-Domain Question Answering | Sep 20, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Personalized Speech Recognition for Children with Test-Time Adaptation | Sep 19, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Enhancing Synthetic Training Data for Speech Commands: From ASR-Based Filtering to Domain Adaptation in SSL Latent Space | Sep 19, 2024 | Automatic Speech Recognition, Data Augmentation | Unverified | 0 |
| Channel-Aware Domain-Adaptive Generative Adversarial Network for Robust Speech Recognition | Sep 19, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR | Sep 18, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Large Language Models are Strong Audio-Visual Speech Recognition Learners | Sep 18, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 2 |
| ASR Benchmarking: Need for a More Representative Conversational Dataset | Sep 18, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses | Sep 17, 2024 | Action Detection, Activity Detection | Unverified | 0 |
| Chain-of-Thought Prompting for Speech Translation | Sep 17, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |