Leveraging LLM and Self-Supervised Training Models for Speech Recognition in Chinese Dialects: A Comparative Analysis | May 27, 2025 | Accented Speech Recognition, Self-Supervised Learning
Unverified 0 | Topological Deep Learning for Speech Data | May 27, 2025 | Deep Learning, Phoneme Recognition
Unverified 0 | Loquacious Set: 25,000 Hours of Transcribed and Diverse English Speech Recognition Data for Research and Commercial Use | May 27, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | PSRB: A Comprehensive Benchmark for Evaluating Persian ASR Systems | May 27, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | In-context Language Learning for Endangered Languages in Speech Recognition | May 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | The NaijaVoices Dataset: Cultivating Large-Scale, High-Quality, Culturally-Rich Speech Data for African Languages | May 26, 2025 | Automatic Speech Recognition, Diversity
Unverified 0 | Robust fine-tuning of speech recognition models via model merging: application to disordered speech | May 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | Beyond Manual Transcripts: The Potential of Automated Speech Recognition Errors in Improving Alzheimer's Disease Detection | May 26, 2025 | Alzheimer's Disease Detection, Automatic Speech Recognition
Unverified 0 | Continuous Learning for Children's ASR: Overcoming Catastrophic Forgetting with Elastic Weight Consolidation and Synaptic Intelligence | May 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | Mixture of LoRA Experts for Low-Resourced Multi-Accent Automatic Speech Recognition | May 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | KIT's Low-resource Speech Translation Systems for IWSLT2025: System Enhancement with Synthetic Data and Model Regularization | May 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | Languages in Multilingual Speech Foundation Models Align Both Phonetically and Semantically | May 26, 2025 | Retrieval, speech-recognition
Unverified 0 | Exploring Generative Error Correction for Dysarthric Speech Recognition | May 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Code Available 0 | Novel Loss-Enhanced Universal Adversarial Patches for Sustainable Speaker Privacy | May 26, 2025 | Speaker anonymization, speech-recognition
Unverified 0 | WhisperD: Dementia Speech Recognition and Filler Word Detection with Whisper | May 25, 2025 | speech-recognition, Speech Recognition
Unverified 0 | CHSER: A Dataset and Case Study on Generative Speech Error Correction for Child ASR | May 24, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Code Available 0 | Building a Functional Machine Translation Corpus for Kpelle | May 24, 2025 | Data Augmentation, Language Modelling
Unverified 0 | StandUp4AI: A New Multilingual Dataset for Humor Detection in Stand-up Comedy Videos | May 24, 2025 | Humor Detection, speech-recognition
Unverified 0 | VietASR: Achieving Industry-level Vietnamese ASR with 50-hour labeled data and Large-Scale Speech Pretraining | May 23, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition | May 23, 2025 | speech-recognition, Speech Recognition
Unverified 0 | LLM-based Generative Error Correction for Rare Words with Synthetic Data and Phonetic Context | May 23, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Code Available 0 | CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training | May 23, 2025 | Automatic Speech Recognition, Emotion Recognition
Code Available 11 | Speechless: Speech Instruction Training Without Speech for Low Resource Languages | May 23, 2025 | speech-recognition, Speech Recognition
Code Available 7 | Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities | May 23, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Code Available 1 | SoccerChat: Integrating Multimodal Data for Enhanced Soccer Game Understanding | May 22, 2025 | Action Classification, Automatic Speech Recognition
Code Available 0 | An Effective Training Framework for Light-Weight Automatic Speech Recognition Models | May 22, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | Large Language Models based ASR Error Correction for Child Conversations | May 22, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition | May 22, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Code Available 1 | Word Level Timestamp Generation for Automatic Speech Recognition and Translation | May 21, 2025 | Automatic Speech Recognition, automatic-speech-translation
Code Available 0 | From Weak Labels to Strong Results: Utilizing 5,000 Hours of Noisy Classroom Transcripts with Minimal Accurate Data | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | Impact of Frame Rates on Speech Tokenizer: A Case Study on Mandarin and English | May 20, 2025 | Automatic Speech Recognition, speech-recognition
Unverified 0 | In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | Towards Inclusive ASR: Investigating Voice Conversion for Dysarthric Speech Recognition in Low-Resource Languages | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Code Available 0 | Dual Precision Quantization for Efficient and Accurate Deep Neural Networks Inference | May 20, 2025 | Quantization, speech-recognition
Code Available 0 | Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language | May 20, 2025 | Multi-Task Learning, Sign Language Recognition
Code Available 0 | HausaNLP: Current Status, Challenges and Future Directions for Hausa Natural Language Processing | May 20, 2025 | Language Modeling, Language Modelling
Unverified 0 | The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition | May 20, 2025 | Audio-Visual Speech Recognition, speaker-diarization
Unverified 0 | Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach | May 20, 2025 | Audio-Visual Speech Recognition, Mixture-of-Experts
Unverified 0 | PersonaTAB: Predicting Personality Traits using Textual, Acoustic, and Behavioral Cues in Fully-Duplex Speech Dialogs | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Code Available 0 | Multi-head Temporal Latent Attention | May 19, 2025 | GPU, speech-recognition
Code Available 4 | Granary: Speech Recognition and Translation Dataset in 25 European Languages | May 19, 2025 | Hallucination, Punctuation Restoration
Unverified 0 | Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down | May 19, 2025 | Automatic Speech Recognition, Decoder
Unverified 0 | KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025 | May 19, 2025 | Automatic Speech Recognition, Instruction Following
Unverified 0 | Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR | May 19, 2025 | Automatic Speech Recognition, Graph Matching
Unverified 0 | ASR-FAIRBENCH: Measuring and Benchmarking Equity Across Speech Recognition Systems | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | Automatic Speech Recognition for African Low-Resource Languages: Challenges and Future Directions | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR)
Unverified 0 | Multi-Stage Speaker Diarization for Noisy Classrooms | May 16, 2025 | Action Detection, Activity Detection
Code Available 0