| SoccerChat: Integrating Multimodal Data for Enhanced Soccer Game Understanding | May 22, 2025 | Action Classification, Automatic Speech Recognition | Code Available | 0 |
| Word Level Timestamp Generation for Automatic Speech Recognition and Translation | May 21, 2025 | Automatic Speech Recognition, automatic-speech-translation | Unverified | 0 |
| From Weak Labels to Strong Results: Utilizing 5,000 Hours of Noisy Classroom Transcripts with Minimal Accurate Data | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| PersonaTAB: Predicting Personality Traits using Textual, Acoustic, and Behavioral Cues in Fully-Duplex Speech Dialogs | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Towards Inclusive ASR: Investigating Voice Conversion for Dysarthric Speech Recognition in Low-Resource Languages | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| Impact of Frame Rates on Speech Tokenizer: A Case Study on Mandarin and English | May 20, 2025 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR | May 19, 2025 | Automatic Speech Recognition, Graph Matching | Unverified | 0 |
| Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down | May 19, 2025 | Automatic Speech Recognition, Decoder | Unverified | 0 |
| KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025 | May 19, 2025 | Automatic Speech Recognition, Instruction Following | Unverified | 0 |
| LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Automatic Speech Recognition for African Low-Resource Languages: Challenges and Future Directions | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Multi-Stage Speaker Diarization for Noisy Classrooms | May 16, 2025 | Action Detection, Activity Detection | Code Available | 0 |
| ASR-FAIRBENCH: Measuring and Benchmarking Equity Across Speech Recognition Systems | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Remote Rowhammer Attack using Adversarial Observations on Federated Learning Clients | May 9, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Teochew-Wild: The First In-the-wild Teochew Dataset with Orthographic Annotations | May 8, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Fairness of Automatic Speech Recognition in Cleft Lip and Palate Speech | May 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| SepALM: Audio Language Models Are Error Correctors for Robust Speech Separation | May 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Transfer Learning-Based Deep Residual Learning for Speech Recognition in Clean and Noisy Environments | May 2, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition | Apr 30, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| Retrieval-Enhanced Few-Shot Prompting for Speech Event Extraction | Apr 30, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models | Apr 21, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides | Apr 21, 2025 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| Acoustic to Articulatory Inversion of Speech; Data Driven Approaches, Challenges, Applications, and Future Scope | Apr 17, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Advancing Arabic Speech Recognition Through Large-Scale Weakly Supervised Learning | Apr 16, 2025 | Arabic Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| Spatial Audio Processing with Large Language Model on Wearable Devices | Apr 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Visual-Aware Speech Recognition for Noisy Scenarios | Apr 9, 2025 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified | 0 |
| DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation | Apr 7, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| LinTO Audio and Textual Datasets to Train and Evaluate Automatic Speech Recognition in Tunisian Arabic Dialect | Apr 3, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Chain of Correction for Full-text Speech Recognition with Large Language Models | Apr 2, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems | Apr 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| The Impact of Code-switched Synthetic Data Quality is Task Dependent: Insights from MT and ASR | Mar 30, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| VALLR: Visual ASR Language Model for Lip Reading | Mar 27, 2025 | Automatic Speech Recognition, Language Modeling | Unverified | 0 |
| FinAudio: A Benchmark for Audio Large Language Models in Financial Applications | Mar 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Boosting the Transferability of Audio Adversarial Examples with Acoustic Representation Optimization | Mar 25, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Whispering in Amharic: Fine-tuning Whisper for Low-resource Language | Mar 24, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Your voice is your voice: Supporting Self-expression through Speech Generation and LLMs in Augmented and Alternative Communication | Mar 21, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Evaluating ASR Confidence Scores for Automated Error Detection in User-Assisted Correction Interfaces | Mar 19, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Halving transcription time: A fast, user-friendly and GDPR-compliant workflow to create AI-assisted transcripts for content analysis | Mar 17, 2025 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| Enhancing Aviation Communication Transcription: Fine-Tuning Distil-Whisper with LoRA | Mar 13, 2025 | Automatic Speech Recognition, parameter-efficient fine-tuning | Unverified | 0 |
| ValSub: Subsampling Validation Data to Mitigate Forgetting during ASR Personalization | Mar 12, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Everything Can Be Described in Words: A Simple Unified Multi-Modal Framework with Semantic and Temporal Alignment | Mar 12, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR | Mar 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Building English ASR model with regional language support | Mar 10, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Automatic Speech Recognition for Non-Native English: Accuracy and Disfluency Handling | Mar 10, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| From Voice to Safety: Language AI Powered Pilot-ATC Communication Understanding for Airport Surface Movement Collision Risk Assessment | Mar 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Qieemo: Speech Is All You Need in the Emotion Recognition in Conversations | Mar 5, 2025 | All, Automatic Speech Recognition | Unverified | 0 |
| Direct Speech to Speech Translation: A Review | Mar 3, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |