Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language | May 20, 2025 | Multi-Task Learning, Sign Language Recognition | Code Available
PersonaTAB: Predicting Personality Traits using Textual, Acoustic, and Behavioral Cues in Fully-Duplex Speech Dialogs | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available
From Weak Labels to Strong Results: Utilizing 5,000 Hours of Noisy Classroom Transcripts with Minimal Accurate Data | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach | May 20, 2025 | Audio-Visual Speech Recognition, Mixture-of-Experts | Unverified
HausaNLP: Current Status, Challenges and Future Directions for Hausa Natural Language Processing | May 20, 2025 | Language Modeling, Language Modelling | Unverified
Impact of Frame Rates on Speech Tokenizer: A Case Study on Mandarin and English | May 20, 2025 | Automatic Speech Recognition, speech-recognition | Unverified
In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition | May 20, 2025 | Audio-Visual Speech Recognition, speaker-diarization | Unverified
Towards Inclusive ASR: Investigating Voice Conversion for Dysarthric Speech Recognition in Low-Resource Languages | May 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available
Dual Precision Quantization for Efficient and Accurate Deep Neural Networks Inference | May 20, 2025 | Quantization, speech-recognition | Unverified
KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025 | May 19, 2025 | Automatic Speech Recognition, Instruction Following | Unverified
Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR | May 19, 2025 | Automatic Speech Recognition, Graph Matching | Unverified
Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down | May 19, 2025 | Automatic Speech Recognition, Decoder | Unverified
Granary: Speech Recognition and Translation Dataset in 25 European Languages | May 19, 2025 | Hallucination, Punctuation Restoration | Unverified
Automatic Speech Recognition for African Low-Resource Languages: Challenges and Future Directions | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
ASR-FAIRBENCH: Measuring and Benchmarking Equity Across Speech Recognition Systems | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Multi-Stage Speaker Diarization for Noisy Classrooms | May 16, 2025 | Action Detection, Activity Detection | Code Available
LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models | May 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Quantized Approximate Signal Processing (QASP): Towards Homomorphic Encryption for audio | May 15, 2025 | Speaker Identification, speech-recognition | Unverified
Inclusivity of AI Speech in Healthcare: A Decade Look Back | May 15, 2025 | speech-recognition, Speech Recognition | Unverified
Full simulation on the dynamics of auditory synaptic fusion: Strong clustering of calcium channel might be the origin of the coherent release in the auditory hair cells | May 12, 2025 | speech-recognition, Speech Recognition | Unverified
Remote Rowhammer Attack using Adversarial Observations on Federated Learning Clients | May 9, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Teochew-Wild: The First In-the-wild Teochew Dataset with Orthographic Annotations | May 8, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer | May 7, 2025 | Audio-Visual Speech Recognition, Lip Reading | Unverified
Robust Speech Recognition with Schrödinger Bridge-Based Speech Enhancement | May 7, 2025 | Robust Speech Recognition, Speech Enhancement | Unverified
SepALM: Audio Language Models Are Error Correctors for Robust Speech Separation | May 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Fairness of Automatic Speech Recognition in Cleft Lip and Palate Speech | May 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Transforming faces into video stories -- VideoFace2.0 | May 4, 2025 | Face Detection, Face Recognition | Code Available
A Synergistic Framework of Nonlinear Acoustic Computing and Reinforcement Learning for Real-World Human-Robot Interaction | May 4, 2025 | reinforcement-learning, Reinforcement Learning | Unverified
Transfer Learning-Based Deep Residual Learning for Speech Recognition in Clean and Noisy Environments | May 2, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition | Apr 30, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available
Retrieval-Enhanced Few-Shot Prompting for Speech Event Extraction | Apr 30, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Development and evaluation of a deep learning algorithm for German word recognition from lip movements | Apr 22, 2025 | Lip Reading, speech-recognition | Unverified
TinyML for Speech Recognition | Apr 22, 2025 | speech-recognition, Speech Recognition | Code Available
StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models | Apr 21, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides | Apr 21, 2025 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified
Acoustic to Articulatory Inversion of Speech; Data Driven Approaches, Challenges, Applications, and Future Scope | Apr 17, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Advancing Arabic Speech Recognition Through Large-Scale Weakly Supervised Learning | Apr 16, 2025 | Arabic Speech Recognition, Automatic Speech Recognition | Unverified
Dysarthria Normalization via Local Lie Group Transformations for Robust ASR | Apr 16, 2025 | Robust Speech Recognition, speech-recognition | Code Available
Spatial Audio Processing with Large Language Model on Wearable Devices | Apr 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Summarizing Speech: A Comprehensive Survey | Apr 10, 2025 | Meeting Summarization, speech-recognition | Unverified
Visual-Aware Speech Recognition for Noisy Scenarios | Apr 9, 2025 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Unverified
RNN-Transducer-based Losses for Speech Recognition on Noisy Targets | Apr 9, 2025 | speech-recognition, Speech Recognition | Code Available
DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation | Apr 7, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available
A Human Digital Twin Architecture for Knowledge-based Interactions and Context-Aware Conversations | Apr 4, 2025 | speech-recognition, Speech Recognition | Unverified
Edge Intelligence for Wildlife Conservation: Real-Time Hornbill Call Classification Using TinyML | Apr 3, 2025 | Edge-computing, speech-recognition | Unverified
LinTO Audio and Textual Datasets to Train and Evaluate Automatic Speech Recognition in Tunisian Arabic Dialect | Apr 3, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Chain of Correction for Full-text Speech Recognition with Large Language Models | Apr 2, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified