Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC Sep 19, 2024 Disentanglement speech-recognition
Approaching Deep Learning through the Spectral Dynamics of Weights Aug 21, 2024 Deep Learning image-classification
SER Evals: In-domain and Out-of-domain Benchmarking for Speech Emotion Recognition Aug 14, 2024 Automatic Speech Recognition Benchmarking
LI-TTA: Language Informed Test-Time Adaptation for Automatic Speech Recognition Aug 11, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features Aug 3, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation Jul 26, 2024 Contrastive Learning speech-recognition
Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing Jul 26, 2024 Attribute Language Modelling
Evolutionary Prompt Design for LLM-Based Post-ASR Error Correction Jul 23, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
dMel: Speech Tokenization made Simple Jul 22, 2024 Decoder Language Modeling
Framework for Curating Speech Datasets and Evaluating ASR Systems: A Case Study for Polish Jul 18, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors Jul 16, 2024 Automatic Phoneme Recognition Automatic Speech Recognition (ASR)
Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System Jul 13, 2024 Decoder speech-recognition
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers Jul 9, 2024 Audio-Visual Speech Recognition speech-recognition
Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models Jul 5, 2024 Adversarial Attack Automatic Speech Recognition
Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition Jul 4, 2024 Audio-Visual Speech Recognition speech-recognition
Improving Self-supervised Pre-training using Accent-Specific Codebooks Jul 4, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models Jul 2, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs Jun 26, 2024 ArzEn Code-switched Translation to ara ArzEn Code-switched Translation to eng
Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet Jun 25, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model Jun 25, 2024 Automatic Lyrics Transcription Automatic Speech Recognition
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech Jun 16, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Prompting Large Language Models with Audio for General-Purpose Speech Summarization Jun 10, 2024 speech-recognition Speech Recognition
LLM-based speaker diarization correction: A generalizable approach Jun 7, 2024 speaker-diarization Speaker Diarization
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement Jun 6, 2024 Diversity Speech Enhancement
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition Jun 6, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Label-Synchronous Neural Transducer for E2E Simultaneous Speech Translation Jun 6, 2024 es-en speech-recognition
A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition May 27, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset May 12, 2024 Action Spotting Automatic Speech Recognition
Watch Your Mouth: Silent Speech Recognition with Depth Sensing May 11, 2024 Deep Learning Lipreading
Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models May 9, 2024 Adversarial Attack Automatic Speech Recognition
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets May 3, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Less Peaky and More Accurate CTC Forced Alignment by Label Priors Apr 22, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
CMULAB: An Open-Source Framework for Training and Deployment of Natural Language Processing Models Apr 3, 2024 Optical Character Recognition (OCR) speech-recognition
Kallaama: A Transcribed Speech Dataset about Agriculture in the Three Most Widely Spoken Languages in Senegal Apr 2, 2024 Automatic Speech Recognition speech-recognition
FlowerFormer: Empowering Neural Architecture Encoding using a Flow-aware Graph Transformer Mar 19, 2024 Representation Learning speech-recognition
Real-Time Multimodal Cognitive Assistant for Emergency Medical Services Mar 11, 2024 Action Recognition Edge-computing
Speech Robust Bench: A Robustness Benchmark For Speech Recognition Mar 8, 2024 Adversarial Robustness Automatic Speech Recognition
Language and Speech Technology for Central Kurdish Varieties Mar 4, 2024 Automatic Speech Recognition Diversity
A Cross-Modal Approach to Silent Speech with LLM-Enhanced Recognition Mar 2, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition Feb 8, 2024 Audio-Visual Speech Recognition Automatic Speech Recognition
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR Feb 6, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion Jan 25, 2024 speech-recognition Speech Recognition
Word-Level ASR Quality Estimation for Efficient Corpus Sampling and Post-Editing through Analyzing Attentions of a Reference-Free Metric Jan 20, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation Jan 18, 2024 Sentence speech-recognition
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition Jan 8, 2024 Decoder speech-recognition
The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in CNVSRC 2023 Jan 7, 2024 Decoder speech-recognition
FlowMur: A Stealthy and Practical Audio Backdoor Attack with Limited Knowledge Dec 15, 2023 Backdoor Attack Data Poisoning
Personalized Autonomous Driving with Large Language Models: Field Experiments Dec 14, 2023 Autonomous Driving Autonomous Vehicles
Extending Whisper with prompt tuning to target-speaker ASR Dec 13, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Graph Convolutions Enrich the Self-Attention in Transformers! Dec 7, 2023 Clone Detection