Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition Feb 2, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Computer-Generated Music for Tabletop Role-Playing Games Aug 16, 2020 speech-recognition Speech Recognition
Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition Mar 6, 2020 Lipreading Lip Reading
NICE: Noise Injection and Clamping Estimation for Neural Network Quantization Sep 29, 2018 General Classification GPU
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling Oct 8, 2020 Speech Recognition text-to-speech
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners May 22, 2022 Attribute Automatic Speech Recognition
ASR2K: Speech Recognition for Around 2000 Languages without Audio Sep 6, 2022 Language Modeling Language Modelling
ArTST: Arabic Text and Speech Transformer Oct 25, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Online Neural Networks for Change-Point Detection Oct 3, 2020 Change Point Detection speech-recognition
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition Jun 30, 2019 Avg Representation Learning
On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems Nov 29, 2022 speech-recognition Speech Recognition
Continuous speech separation: dataset and analysis Jan 30, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Contrastive Learning-Based Audio to Lyrics Alignment for Multiple Languages Jun 13, 2023 Contrastive Learning speech-recognition
Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models Jul 5, 2024 Adversarial Attack Automatic Speech Recognition
A transfer learning based approach for pronunciation scoring Nov 1, 2021 Automatic Speech Recognition Automatic Speech Recognition (ASR)
CopyNE: Better Contextual ASR by Copying Named Entities May 22, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Personalized Autonomous Driving with Large Language Models: Field Experiments Dec 14, 2023 Autonomous Driving Autonomous Vehicles
Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures Jul 27, 2023 Automatic Speech Recognition Contrastive Learning
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT Mar 29, 2022 All Automatic Speech Recognition
CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution Jul 4, 2022 Compiler Optimization image-classification
Monotonic Chunkwise Attention Dec 14, 2017 Document Summarization speech-recognition
Cross Attention Augmented Transducer Networks for Simultaneous Translation Nov 1, 2021 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Recent improvements of ASR models in the face of adversarial attacks Mar 29, 2022 speech-recognition Speech Recognition
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings Mar 30, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
CTC-synchronous Training for Monotonic Attention Model May 10, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset Oct 9, 2021 Deep Learning Emotion Recognition
DARF: A data-reduced FADE version for simulations of speech recognition thresholds with real hearing aids Jul 10, 2020 Sentence speech-recognition
KT-Speech-Crawler: Automatic Dataset Construction for Speech Recognition from YouTube Videos Mar 1, 2019 speech-recognition Speech Recognition
Kurdish (Sorani) Speech to Text: Presenting an Experimental Dataset Nov 29, 2019 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Keyphrase Cloud Generation of Broadcast News Jun 19, 2013 Keyphrase Extraction speech-recognition
Killkan: The Automatic Speech Recognition Dataset for Kichwa with Morphosyntactic Information Apr 23, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition Oct 23, 2023 Automatic Speech Recognition speech-recognition
An End-to-End Neural Network for Polyphonic Piano Music Transcription Aug 7, 2015 Language Modeling Language Modelling
A Deep Dive into the Disparity of Word Error Rates Across Thousands of NPTEL MOOC Videos Jul 20, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning Sep 21, 2016 Decoder Multi-Task Learning
Jasper: An End-to-End Convolutional Neural Acoustic Model Apr 5, 2019 Decoder Language Modeling
Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding Jan 13, 2025 Automatic Speech Recognition intent-classification
Iterative Pseudo-Labeling for Speech Recognition May 19, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation Oct 27, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Addressing Pitfalls in Auditing Practices of Automatic Speech Recognition Technologies: A Case Study of People with Aphasia Jun 10, 2025 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Investigating the Effects of Word Substitution Errors on Sentence Embeddings Nov 16, 2018 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Investigating the Emergent Audio Classification Ability of ASR Foundation Models Nov 15, 2023 Audio Classification Decoder
Intrinsic evaluation of language models for code-switching Nov 1, 2021 Automatic Speech Recognition Automatic Speech Recognition (ASR)
A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition Jul 27, 2018 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition Mar 27, 2018 Robust Speech Recognition Speech Dereverberation
Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations Aug 14, 2023 Action Detection Activity Detection
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification Aug 22, 2023 Self-Supervised Learning Speaker Identification
Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition Nov 16, 2021 Language Modeling Language Modelling
A Dataset for Speech Emotion Recognition in Greek Theatrical Plays Mar 27, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Interpersonal Relationship Labels for the CALLHOME Corpus May 1, 2018 Automatic Speech Recognition (ASR) Speech Recognition