| A Non-autoregressive Model for Joint STT and TTS | Jan 15, 2025 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| persoDA: Personalized Data Augmentation for Personalized ASR | Jan 15, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Loudspeaker Beamforming to Enhance Speech Recognition Performance of Voice Driven Applications | Jan 14, 2025 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| Selective Attention Merging for low resource tasks: A case study of Child ASR | Jan 14, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| AdaCS: Adaptive Normalization for Enhanced Code-Switching ASR | Jan 13, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding | Jan 13, 2025 | Automatic Speech Recognition, intent-classification | Code Available | 0 |
| Speech Recognition for Automatically Assessing Afrikaans and isiXhosa Preschool Oral Narratives | Jan 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| A Survey on Spoken Italian Datasets and Corpora | Jan 11, 2025 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| Discrete Speech Unit Extraction via Independent Component Analysis | Jan 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| Contextual ASR Error Handling with LLMs Augmentation for Goal-Oriented Conversational AI | Jan 10, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Universal-2-TF: Robust All-Neural Text Formatting for ASR | Jan 10, 2025 | All, Automatic Speech Recognition | Unverified | 0 |
| Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Jan 10, 2025 | Automatic Speech Recognition, Classification | Code Available | 0 |
| Benchmarking Rotary Position Embeddings for Automatic Speech Recognition | Jan 10, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Comparing Self-Supervised Learning Models Pre-Trained on Human Speech and Animal Vocalizations for Bioacoustics Processing | Jan 10, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| Deep Learning for Pathological Speech: A Survey | Jan 7, 2025 | Automatic Speech Recognition, Data Augmentation | Unverified | 0 |
| Universal Speaker Embedding Free Target Speaker Extraction and Personal Voice Activity Detection | Jan 7, 2025 | Action Detection, Activity Detection | Unverified | 0 |
| Samba-ASR: State-Of-The-Art Speech Recognition Leveraging Structured State-Space Models | Jan 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Listening and Seeing Again: Generative Error Correction for Audio-Visual Speech Recognition | Jan 3, 2025 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 0 |
| Improving Transducer-Based Spoken Language Understanding with Self-Conditioned CTC and Knowledge Transfer | Jan 3, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Advancing Singlish Understanding: Bridging the Gap with Datasets and Multimodal Models | Jan 2, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| Automatic Text Pronunciation Correlation Generation and Application for Contextual Biasing | Jan 1, 2025 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| Breaking Through the Spike: Spike Window Decoding for Accelerated and Precise Automatic Speech Recognition | Jan 1, 2025 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale | Jan 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Whisper Turns Stronger: Augmenting Wav2Vec 2.0 for Superior ASR in Low-Resource Languages | Dec 31, 2024 | Automatic Speech Recognition, Data Augmentation | Unverified | 0 |
| Fotheidil: an Automatic Transcription System for the Irish Language | Dec 31, 2024 | Action Detection, Activity Detection | Unverified | 0 |
| Enhancing Whisper's Accuracy and Speed for Indian Languages through Prompt-Tuning and Tokenization | Dec 27, 2024 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| Enhancing Audiovisual Speech Recognition through Bifocal Preference Optimization | Dec 26, 2024 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| Zero-resource Speech Translation and Recognition with LLMs | Dec 24, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| UME: Upcycling Mixture-of-Experts for Scalable and Efficient Automatic Speech Recognition | Dec 23, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Transducer-Llama: Integrating LLMs into Streamable Transducer-based Speech Recognition | Dec 21, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Adapting Whisper for Code-Switching through Encoding Refining and Language-Aware Decoding | Dec 21, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Enhancing Multilingual ASR for Unseen Languages via Language Embedding Modeling | Dec 21, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Speech Retrieval-Augmented Generation without Automatic Speech Recognition | Dec 21, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| TouchASP: Elastic Automatic Speech Perception that Everyone Can Touch | Dec 20, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| LAMA-UT: Language Agnostic Multilingual ASR through Orthography Unification and Language-Specific Transliteration | Dec 19, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Transcribing and Translating, Fast and Slow: Joint Speech Translation and Recognition | Dec 19, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Speak & Improve Corpus 2025: an L2 English Speech Corpus for Language Assessment and Feedback | Dec 16, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Speak & Improve Challenge 2025: Tasks and Baseline Systems | Dec 16, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Transliterated Zero-Shot Domain Adaptation for Automatic Speech Recognition | Dec 15, 2024 | Automatic Speech Recognition, Domain Adaptation | Unverified | 0 |
| Efficient Adaptation of Multilingual Models for Japanese ASR | Dec 14, 2024 | Automatic Speech Recognition, speech-recognition | Code Available | 0 |
| Bilevel Joint Unsupervised and Supervised Training for Automatic Speech Recognition | Dec 11, 2024 | Automatic Speech Recognition, speech-recognition | Unverified | 0 |
| Greek2MathTex: A Greek Speech-to-Text Framework for LaTeX Equations Generation | Dec 11, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| Harnessing Transfer Learning from Swahili: Advancing Solutions for Comorian Dialects | Dec 9, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Not All Errors Are Equal: Investigation of Speech Recognition Errors in Alzheimer's Disease Detection | Dec 9, 2024 | All, Alzheimer's Disease Detection | Unverified | 0 |
| Effective Text Adaptation for LLM-based ASR through Soft Prompt Fine-Tuning | Dec 9, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Leveraging Prompt Learning and Pause Encoding for Alzheimer's Disease Detection | Dec 9, 2024 | Alzheimer's Disease Detection, Automatic Speech Recognition | Unverified | 0 |
| SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR | Dec 7, 2024 | Automatic Speech Recognition, Data Augmentation | Code Available | 0 |
| Comprehensive Audio Query Handling System with Integrated Expert Models and Contextual Understanding | Dec 5, 2024 | Audio Generation, Automatic Speech Recognition | Unverified | 0 |
| ASR-EC Benchmark: Evaluating Large Language Models on Chinese ASR Error Correction | Dec 4, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| A Comparative Study of LLM-based ASR and Whisper in Low Resource and Code Switching Scenario | Dec 1, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |