Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System | Jul 13, 2024 | Decoder, speech-recognition | Code Available (1)
Tamil Language Computing: the Present and the Future | Jul 11, 2024 | Language Modelling, Machine Translation | Unverified (0)
Dynamic Encoder Size Based on Data-Driven Layer-wise Pruning for Speech Recognition | Jul 10, 2024 | speech-recognition, Speech Recognition | Unverified (0)
Explaining Spectrograms in Machine Learning: A Study on Neural Networks for Speech Classification | Jul 10, 2024 | Classification, speech-recognition | Code Available (0)
HebDB: a Weakly Supervised Dataset for Hebrew Speech Processing | Jul 10, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Evaluating Voice Command Pipelines for Drone Control: From STT and LLM to Direct Classification and Siamese Networks | Jul 10, 2024 | Language Modeling, Language Modelling | Unverified (0)
A voice and speech corpus of patients who underwent upper airway surgery in pre- and post-operative states | Jul 9, 2024 | Articles, Classification | Code Available (0)
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers | Jul 9, 2024 | Audio-Visual Speech Recognition, speech-recognition | Code Available (1)
Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation | Jul 8, 2024 | Automatic Speech Recognition, Emotion Recognition | Unverified (0)
Homogeneous Speaker Features for On-the-Fly Dysarthric and Elderly Speaker Adaptation | Jul 8, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Morse Code-Enabled Speech Recognition for Individuals with Visual and Hearing Impairments | Jul 7, 2024 | speech-recognition, Speech Recognition | Unverified (0)
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens | Jul 7, 2024 | Language Modelling, Large Language Model | Code Available (11)
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition | Jul 5, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models | Jul 5, 2024 | Automatic Speech Recognition, speech-recognition | Unverified (0)
Semi-supervised Learning for Code-Switching ASR with Large Language Model Filter | Jul 5, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
LearnerVoice: A Dataset of Non-Native English Learners' Spontaneous Speech | Jul 5, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Written Term Detection Improves Spoken Term Detection | Jul 5, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available (0)
Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units | Jul 5, 2024 | Acoustic Unit Discovery, Automatic Speech Recognition | Code Available (2)
Romanization Encoding For Multilingual ASR | Jul 5, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Performance Analysis of Speech Encoders for Low-Resource SLU and ASR in Tunisian Dialect | Jul 5, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Multitaper mel-spectrograms for keyword spotting | Jul 5, 2024 | Keyword Spotting, speech-recognition | Unverified (0)
Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models | Jul 5, 2024 | Adversarial Attack, Automatic Speech Recognition | Code Available (1)
Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models | Jul 5, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation | Jul 4, 2024 | Machine Translation, speech-recognition | Unverified (0)
Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition | Jul 4, 2024 | Audio-Visual Speech Recognition, speech-recognition | Code Available (1)
Improving Accented Speech Recognition using Data Augmentation based on Unsupervised Text-to-Speech Synthesis | Jul 4, 2024 | Accented Speech Recognition, Automatic Speech Recognition | Unverified (0)
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs | Jul 4, 2024 | Emotion Recognition, Event Detection | Code Available (11)
Improving Self-supervised Pre-training using Accent-Specific Codebooks | Jul 4, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available (1)
Serialized Output Training by Learned Dominance | Jul 4, 2024 | Decoder, speech-recognition | Unverified (0)
Multi-Convformer: Extending Conformer with Multiple Convolution Kernels | Jul 4, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition | Jul 3, 2024 | Alzheimer's Disease Detection, Self-Supervised Learning | Unverified (0)
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations | Jul 3, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Advanced Framework for Animal Sound Classification With Features Optimization | Jul 3, 2024 | Classification, Diversity | Unverified (0)
Qifusion-Net: Layer-adapted Stream/Non-stream Model for End-to-End Multi-Accent Speech Recognition | Jul 3, 2024 | speech-recognition, Speech Recognition | Unverified (0)
The USTC-NERCSLIP Systems for The ICMC-ASR Challenge | Jul 2, 2024 | Automatic Speech Recognition, Pseudo Label | Unverified (0)
Towards the Next Frontier in Speech Representation Learning Using Disentanglement | Jul 2, 2024 | Disentanglement, Representation Learning | Unverified (0)
Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models | Jul 2, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available (1)
Toward Automated Detection of Biased Social Signals from the Content of Clinical Conversations | Jul 1, 2024 | Fairness, speech-recognition | Unverified (0)
Cross-Lingual Transfer Learning for Speech Translation | Jul 1, 2024 | Cross-Lingual Transfer, Decoder | Unverified (0)
Less Forgetting for Better Generalization: Exploring Continual-learning Fine-tuning Methods for Speech Self-supervised Representations | Jun 30, 2024 | Continual Learning, Domain Generalization | Unverified (0)
Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition | Jun 29, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Open-Source Conversational AI with SpeechBrain 1.0 | Jun 29, 2024 | Language Modeling, Language Modelling | Unverified (0)
Less is More: Accurate Speech Recognition & Translation without Web-Scale Data | Jun 28, 2024 | Decoder, Machine Translation | Unverified (0)
Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects | Jun 27, 2024 | Automatic Speech Recognition, Machine Translation | Code Available (0)
Tradition or Innovation: A Comparison of Modern ASR Methods for Forced Alignment | Jun 27, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Enhanced ASR Robustness to Packet Loss with a Front-End Adaptation Network | Jun 27, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available (0)
Applying LLMs for Rescoring N-best ASR Hypotheses of Casual Conversations: Effects of Domain Adaptation and Context Carry-over | Jun 27, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR | Jun 26, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Dynamic Data Pruning for Automatic Speech Recognition | Jun 26, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified (0)
Automatic Speech Recognition for Hindi | Jun 26, 2024 | Action Detection, Activity Detection | Unverified (0)