| DPSNN: Spiking Neural Network for Low-Latency Streaming Speech Enhancement | Aug 14, 2024 | Automatic Speech Recognition, Speech Enhancement | Unverified | 0 |
| Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation | Aug 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance | Aug 12, 2024 | Acoustic Scene Classification, Automatic Speech Recognition | Unverified | 0 |
| Enhancing Dialogue Speech Recognition with Robust Contextual Awareness via Noise Representation Learning | Aug 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing | Aug 11, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Improving Whisper's Recognition Performance for Under-Represented Language Kazakh Leveraging Unpaired Speech and Text | Aug 10, 2024 | Automatic Speech Recognition, Hallucination | Unverified | 0 |
| HydraFormer: One Encoder For All Subsampling Rates | Aug 8, 2024 | All, Automatic Speech Recognition | Code Available | 0 |
| Preserving spoken content in voice anonymisation with character-level vocoder conditioning | Aug 8, 2024 | Automatic Speech Recognition, speech-recognition | Code Available | 0 |
| MathBridge: A Large Corpus Dataset for Translating Spoken Mathematical Expressions into LaTeX Formulas for Improved Readability | Aug 7, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| ASR-enhanced Multimodal Representation Learning for Cross-Domain Product Retrieval | Aug 6, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Self-Supervised Learning for Multi-Channel Neural Transducer | Aug 6, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion | Aug 5, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data | Aug 1, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation | Aug 1, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition | Jul 31, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Towards interfacing large language models with ASR systems using confidence measures and prompting | Jul 31, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Leveraging Self-Supervised Models for Automatic Whispered Speech Recognition | Jul 30, 2024 | Automatic Speech Recognition, speech-recognition | Code Available | 0 |
| Improving noisy student training for low-resource languages in End-to-End ASR using CycleGAN and inter-domain losses | Jul 26, 2024 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| On the Effect of Purely Synthetic Training Data for Different Automatic Speech Recognition Architectures | Jul 25, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Improving Domain-Specific ASR with LLM-Generated Contextual Descriptions | Jul 25, 2024 | Automatic Speech Recognition, Decoder | Unverified | 0 |
| Scaling A Simple Approach to Zero-Shot Speech Recognition | Jul 25, 2024 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives | Jul 24, 2024 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization | Jul 23, 2024 | Automatic Speech Recognition, Distant Speech Recognition | Unverified | 0 |
| Quantifying the Role of Textual Predictability in Automatic Speech Recognition | Jul 23, 2024 | Attribute, Automatic Speech Recognition | Unverified | 0 |
| Trading Devil Final: Backdoor attack via Stock market and Bayesian Optimization | Jul 21, 2024 | Automatic Speech Recognition, Backdoor Attack | Unverified | 0 |