| DPSNN: Spiking Neural Network for Low-Latency Streaming Speech Enhancement | Aug 14, 2024 | Automatic Speech Recognition, Speech Enhancement | Unverified | 0 |
| Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation | Aug 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance | Aug 12, 2024 | Acoustic Scene Classification, Automatic Speech Recognition | Unverified | 0 |
| Enhancing Dialogue Speech Recognition with Robust Contextual Awareness via Noise Representation Learning | Aug 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing | Aug 11, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Improving Whisper's Recognition Performance for Under-Represented Language Kazakh Leveraging Unpaired Speech and Text | Aug 10, 2024 | Automatic Speech Recognition, Hallucination | Unverified | 0 |
| HydraFormer: One Encoder For All Subsampling Rates | Aug 8, 2024 | All, Automatic Speech Recognition | Code Available | 0 |
| Preserving spoken content in voice anonymisation with character-level vocoder conditioning | Aug 8, 2024 | Automatic Speech Recognition, speech-recognition | Code Available | 0 |
| MathBridge: A Large Corpus Dataset for Translating Spoken Mathematical Expressions into LaTeX Formulas for Improved Readability | Aug 7, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| ASR-enhanced Multimodal Representation Learning for Cross-Domain Product Retrieval | Aug 6, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |