A New Frontier of AI: On-Device AI Training and Personalization (Jun 9, 2022). Tags: Efficient Neural Network; speech-recognition. Code available.
Visual Speech Recognition for Multiple Languages in the Wild (Feb 26, 2022). Tags: Hyperparameter Optimization; Lipreading. Code available.
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages (Oct 1, 2024). Tags: Automatic Speech Recognition; speech-recognition. Code available.
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation (Mar 1, 2023). Tags: Audio-Visual Speech Recognition; Robust Speech Recognition. Code available.
NusaCrowd: Open Source Initiative for Indonesian NLP Resources (Dec 19, 2022). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
MambAttention: Mamba with Multi-Head Attention for Generalizable Single-Channel Speech Enhancement (Jul 1, 2025). Tags: Automatic Speech Recognition; Mamba. Code available.
LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation (Feb 27, 2025). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges (Apr 24, 2024). Tags: Drug Design; Inductive Bias. Code available.
LightSeq2: Accelerated Training for Transformer-based Models on GPUs (Oct 12, 2021). Tags: Decoder; GPU. Code available.
CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization (May 6, 2025). Tags: Active Speaker Detection; Audio-Visual Speech Recognition. Code available.
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction (Jan 5, 2022). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT (Oct 7, 2023). Tags: Audio captioning; Automatic Speech Recognition. Code available.
Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition (May 23, 2024). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Liquid Structural State-Space Models (Sep 26, 2022). Tags: Heart rate estimation; Long-range modeling. Code available.
ICASSP 2022 Acoustic Echo Cancellation Challenge (Feb 27, 2022). Tags: Acoustic echo cancellation; Speech Enhancement. Code available.
Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions (Sep 13, 2024). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec (Sep 14, 2023). Tags: Automatic Speech Recognition; speech-recognition. Code available.
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity (Feb 13, 2024). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension (Feb 12, 2024). Tags: 2k; Automatic Speech Recognition. Code available.
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition (Jan 19, 2024). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography (Oct 26, 2024). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition (Dec 30, 2024). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Fast Transformers with Clustered Attention (Jul 9, 2020). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Dialectal Coverage And Generalization in Arabic Speech Recognition (Nov 7, 2024). Tags: Arabic Speech Recognition; Automatic Speech Recognition. Code available.
BLSP-Emo: Towards Empathetic Large Speech-Language Models (Jun 6, 2024). Tags: Emotion Recognition; Instruction Following. Code available.
Large Language Models are Strong Audio-Visual Speech Recognition Learners (Sep 18, 2024). Tags: Audio-Visual Speech Recognition; Automatic Speech Recognition. Code available.
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition (Jan 8, 2024). Tags: Decoder; speech-recognition. Code available.
CTC-synchronous Training for Monotonic Attention Model (May 10, 2020). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Cross Attention Augmented Transducer Networks for Simultaneous Translation (Nov 1, 2021). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition (May 16, 2023). Tags: Audio-Visual Speech Recognition; Automatic Speech Recognition. Code available.
D4AM: A General Denoising Framework for Downstream Acoustic Models (Nov 28, 2023). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
CoVoST 2 and Massively Multilingual Speech-to-Text Translation (Jul 20, 2020). Tags: Machine Translation; speech-recognition. Code available.
CopyNE: Better Contextual ASR by Copying Named Entities (May 22, 2023). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Convolutional Neural Network (CNN) to reduce construction loss in JPEG compression caused by Discrete Fourier Transform (DFT) (Aug 26, 2022). Tags: Data Compression; Image Compression. Code available.
CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese (Oct 14, 2021). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution (Jul 4, 2022). Tags: Compiler Optimization; image-classification. Code available.
Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities (May 23, 2025). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
A Comparison of Methods for OOV-word Recognition on a New Public Dataset (Jul 16, 2021). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech (Jun 16, 2024). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Continuous speech separation: dataset and analysis (Jan 30, 2020). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context (May 7, 2020). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features (Aug 3, 2024). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition (Jun 30, 2019). Tags: Avg; Representation Learning. Code available.
Contrastive Learning-Based Audio to Lyrics Alignment for Multiple Languages (Jun 13, 2023). Tags: Contrastive Learning; speech-recognition. Code available.
Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition (Oct 22, 2020). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Computer-Generated Music for Tabletop Role-Playing Games (Aug 16, 2020). Tags: speech-recognition; Speech Recognition. Code available.
Compiling ONNX Neural Network Models Using MLIR (Aug 19, 2020). Tags: speech-recognition; Speech Recognition. Code available.
Comparative layer-wise analysis of self-supervised speech models (Nov 8, 2022). Tags: speech-recognition; Speech Recognition. Code available.
Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition (Feb 2, 2023). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.
Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI (Dec 5, 2021). Tags: Automatic Speech Recognition; Automatic Speech Recognition (ASR). Code available.