| Exploring Gender Disparities in Automatic Speech Recognition Technology | Feb 25, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM | Feb 24, 2025 | Automatic Speech Recognition, Language Modeling | Unverified | 0 |
| Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation | Feb 24, 2025 | Automatic Speech Recognition, Diversity | Unverified | 0 |
| Understanding Zero-shot Rare Word Recognition Improvements Through LLM Integration | Feb 22, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages | Feb 21, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders | Feb 21, 2025 | Audio captioning, Automatic Speech Recognition | Unverified | 0 |
| WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models | Feb 20, 2025 | Automatic Speech Recognition, RAG | Unverified | 0 |
| Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks | Feb 19, 2025 | Automatic Speech Recognition, speech-recognition | Code Available | 0 |
| Adopting Whisper for Confidence Estimation | Feb 19, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models | Feb 18, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization | Feb 18, 2025 | Automatic Speech Recognition, Speaker Identification | Unverified | 0 |
| Benchmarking Automatic Speech Recognition coupled LLM Modules for Medical Diagnostics | Feb 18, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Gesture-Aware Zero-Shot Speech Recognition for Patients with Language Disorders | Feb 18, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities | Feb 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge | Feb 14, 2025 | Action Detection, Activity Detection | Unverified | 0 |
| MTLM: Incorporating Bidirectional Text Information to Enhance Language Model Training in Speech Recognition Systems | Feb 14, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Causal Analysis of ASR Errors for Children: Quantifying the Impact of Physiological, Cognitive, and Extrinsic Factors | Feb 12, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverberation and Blind RIR Identification | Feb 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models | Feb 9, 2025 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 1 |
| Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance | Feb 7, 2025 | Automatic Speech Recognition, Decoder | Unverified | 0 |
| Evaluating Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource Performance | Feb 7, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers | Feb 6, 2025 | Automatic Speech Recognition, Decoder | Unverified | 0 |
| Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond | Feb 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling | Feb 5, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| Gradient Norm-based Fine-Tuning for Backdoor Defense in Automatic Speech Recognition | Feb 3, 2025 | Automatic Speech Recognition, backdoor defense | Unverified | 0 |
| CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition | Feb 3, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| A Differentiable Alignment Framework for Sequence-to-Sequence Modeling via Optimal Transport | Feb 3, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Data-Driven Mispronunciation Pattern Discovery for Robust Speech Recognition | Feb 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Sagalee: an Open Source Automatic Speech Recognition Dataset for Oromo Language | Feb 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation | Feb 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Language Bias in Self-Supervised Learning For Automatic Speech Recognition | Jan 31, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions | Jan 31, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Cross-lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition | Jan 29, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Classification Error Bound for Low Bayes Error Conditions in Machine Learning | Jan 27, 2025 | Automatic Speech Recognition, Classification | Unverified | 0 |
| SEAL: Speech Embedding Alignment Learning for Speech Large Language Model with Retrieval-Augmented Generation | Jan 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| The Multicultural Medical Assistant: Can LLMs Improve Medical ASR Errors Across Borders? | Jan 25, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Speech Translation Refinement using Large Language Models | Jan 25, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration | Jan 24, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 5 |
| LoCoML: A Framework for Real-World ML Inference Pipelines | Jan 24, 2025 | Automatic Speech Recognition, Machine Translation | Unverified | 0 |
| Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing | Jan 23, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| FlanEC: Exploring Flan-T5 for Post-ASR Error Correction | Jan 22, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions | Jan 22, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| Investigation of Whisper ASR Hallucinations Induced by Non-Speech Audio | Jan 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Generative AI and Large Language Models in Language Preservation: Opportunities and Challenges | Jan 20, 2025 | Automatic Speech Recognition, Diversity | Unverified | 0 |
| GEC-RAG: Improving Generative Error Correction via Retrieval-Augmented Generation for Automatic Speech Recognition Systems | Jan 18, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| A Benchmark of French ASR Systems Based on Error Severity | Jan 18, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Unsupervised Rhythm and Voice Conversion of Dysarthric to Healthy Speech for ASR | Jan 17, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Automatic Speech Recognition for Sanskrit with Transfer Learning | Jan 17, 2025 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| Delayed Fusion: Integrating Large Language Models into First-Pass Decoding in End-to-end Speech Recognition | Jan 16, 2025 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| PIER: A Novel Metric for Evaluating What Matters in Code-Switching | Jan 16, 2025 | Automatic Speech Recognition, Decoder | Code Available | 0 |