| Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play | May 5, 2025 | AI Agent, Automatic Speech Recognition | Code Available | 3 |
| Transfer Learning-Based Deep Residual Learning for Speech Recognition in Clean and Noisy Environments | May 2, 2025 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Retrieval-Enhanced Few-Shot Prompting for Speech Event Extraction | Apr 30, 2025 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition | Apr 30, 2025 | Automatic Speech Recognition (ASR) | Code Available | 0 |
| Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides | Apr 21, 2025 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models | Apr 21, 2025 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Acoustic to Articulatory Inversion of Speech; Data Driven Approaches, Challenges, Applications, and Future Scope | Apr 17, 2025 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Advancing Arabic Speech Recognition Through Large-Scale Weakly Supervised Learning | Apr 16, 2025 | Arabic Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| Spatial Audio Processing with Large Language Model on Wearable Devices | Apr 11, 2025 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Visual-Aware Speech Recognition for Noisy Scenarios | Apr 9, 2025 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |