Whispering in Amharic: Fine-tuning Whisper for Low-resource Language | Mar 24, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models | Mar 22, 2025 | Computational Efficiency; Mamba | Unverified 0
Your voice is your voice: Supporting Self-expression through Speech Generation and LLMs in Augmented and Alternative Communication | Mar 21, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors | Mar 20, 2025 | speaker-diarization; Speaker Diarization | Unverified 0
A Comprehensive Survey on Architectural Advances in Deep CNNs: Challenges, Applications, and Emerging Research Directions | Mar 19, 2025 | Action Recognition; Computational Efficiency | Unverified 0
Evaluating ASR Confidence Scores for Automated Error Detection in User-Assisted Correction Interfaces | Mar 19, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Halving transcription time: A fast, user-friendly and GDPR-compliant workflow to create AI-assisted transcripts for content analysis | Mar 17, 2025 | Automatic Speech Recognition; speech-recognition | Unverified 0
MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens | Mar 14, 2025 | Audio-Visual Speech Recognition; Computational Efficiency | Code Available 1
Enhancing Aviation Communication Transcription: Fine-Tuning Distil-Whisper with LoRA | Mar 13, 2025 | Automatic Speech Recognition; parameter-efficient fine-tuning | Unverified 0
Whisper Speaker Identification: Leveraging Pre-Trained Multilingual Transformers for Robust Speaker Embeddings | Mar 13, 2025 | Speaker Identification; speech-recognition | Code Available 1
Proceedings of the ISCA/ITG Workshop on Diversity in Large Speech and Language Models | Mar 12, 2025 | Diversity; General Knowledge | Unverified 0
ValSub: Subsampling Validation Data to Mitigate Forgetting during ASR Personalization | Mar 12, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Quantization for OpenAI's Whisper Models: A Comparative Analysis | Mar 12, 2025 | Quantization; speech-recognition | Code Available 0
Everything Can Be Described in Words: A Simple Unified Multi-Modal Framework with Semantic and Temporal Alignment | Mar 12, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Lend a Hand: Semi Training-Free Cued Speech Recognition via MLLM-Driven Hand Modeling for Barrier-free Communication | Mar 11, 2025 | Lip Reading; Prompt Engineering | Code Available 0
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR | Mar 11, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Automatic Speech Recognition for Non-Native English: Accuracy and Disfluency Handling | Mar 10, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Building English ASR model with regional language support | Mar 10, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs | Mar 9, 2025 | Audio-Visual Speech Recognition; Computational Efficiency | Unverified 0
A Noise-Robust Turn-Taking System for Real-World Dialogue Robots: A Field Experiment | Mar 8, 2025 | speech-recognition; Speech Recognition | Code Available 2
Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations | Mar 8, 2025 | Audio-Visual Speech Recognition; Multi-Task Learning | Code Available 1
A Causal Inference Approach for Quantifying Research Impact | Mar 7, 2025 | Causal Inference; counterfactual | Unverified 0
Self-Supervised Models for Phoneme Recognition: Applications in Children's Speech for Reading Learning | Mar 6, 2025 | Phoneme Recognition; Self-Supervised Learning | Unverified 0
From Voice to Safety: Language AI Powered Pilot-ATC Communication Understanding for Airport Surface Movement Collision Risk Assessment | Mar 6, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Qieemo: Speech Is All You Need in the Emotion Recognition in Conversations | Mar 5, 2025 | All; Automatic Speech Recognition | Unverified 0
CORDIC Is All You Need | Mar 4, 2025 | All; speech-recognition | Unverified 0
Direct Speech to Speech Translation: A Review | Mar 3, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Fine-Tuning Whisper for Inclusive Prosodic Stress Analysis | Mar 3, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Unveiling Biases while Embracing Sustainability: Assessing the Dual Challenges of Automatic Speech Recognition Systems | Mar 2, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation | Mar 2, 2025 | Decoder; Representation Learning | Unverified 0
LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation | Feb 27, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Code Available 2
CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR | Feb 27, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Code Available 2
Adapting Automatic Speech Recognition for Accented Air Traffic Control Communications | Feb 27, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | Feb 26, 2025 | Audio Synthesis; Automatic Speech Recognition | Unverified 0
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition | Feb 26, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Exploring Gender Disparities in Automatic Speech Recognition Technology | Feb 25, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM | Feb 24, 2025 | Automatic Speech Recognition; Language Modeling | Unverified 0
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation | Feb 24, 2025 | Automatic Speech Recognition; Diversity | Unverified 0
Improving the Inclusivity of Dutch Speech Recognition by Fine-tuning Whisper on the JASMIN-CGN Corpus | Feb 24, 2025 | Automatic Speech Recognition (ASR); speech-recognition | Code Available 0
Understanding Zero-shot Rare Word Recognition Improvements Through LLM Integration | Feb 22, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages | Feb 21, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders | Feb 21, 2025 | Audio captioning; Automatic Speech Recognition | Unverified 0
Retrieval-Augmented Speech Recognition Approach for Domain Challenges | Feb 21, 2025 | Decoder; RAG | Unverified 0
WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models | Feb 20, 2025 | Automatic Speech Recognition; RAG | Unverified 0
Moshi Moshi? A Model Selection Hijacking Adversarial Attack | Feb 20, 2025 | Adversarial Attack; Computational Efficiency | Unverified 0
Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks | Feb 19, 2025 | Automatic Speech Recognition; speech-recognition | Code Available 0
Adopting Whisper for Confidence Estimation | Feb 19, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models | Feb 18, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR) | Unverified 0
On the Robust Approximation of ASR Metrics | Feb 18, 2025 | speech-recognition; Speech Recognition | Unverified 0
Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization | Feb 18, 2025 | Automatic Speech Recognition; Speaker Identification | Unverified 0