AVATAR: Unconstrained Audiovisual Speech Recognition Jun 15, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition May 27, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
AV Taris: Online Audio-Visual Speech Recognition Dec 14, 2020 Action Detection Activity Detection
Back Translation for Speech-to-text Translation Without Transcripts May 15, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
BackdoorMBTI: A Backdoor Learning Multimodal Benchmark Tool Kit for Backdoor Defense Evaluation Nov 17, 2024 Action Recognition backdoor defense
Adaptation of Whisper models to child speech recognition Jul 24, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Framework for Curating Speech Datasets and Evaluating ASR Systems: A Case Study for Polish Jul 18, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
BASPRO: a balanced script producer for speech corpus collection based on the genetic algorithm Dec 11, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Attention model for articulatory features detection Jul 2, 2019 Manner Of Articulation Detection model
Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings Apr 8, 2021 Emotion Recognition Speech Emotion Recognition
Adapting End-to-End Speech Recognition for Readable Subtitles May 25, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks Oct 7, 2016 Anomaly Detection Automatic Speech Recognition
BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data Jan 28, 2021 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models Feb 9, 2025 Audio-Visual Speech Recognition Automatic Speech Recognition
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement Jun 6, 2024 Diversity Speech Enhancement
End-to-end Named Entity Recognition from English Speech May 22, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)
End-to-End Speech Recognition and Disfluency Removal Sep 22, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)
End-to-End Speech Recognition from Federated Acoustic Models Apr 29, 2021 2k 4k
Adapting Pretrained Transformer to Lattices for Spoken Language Understanding Nov 2, 2020 Natural Language Understanding speech-recognition
BIG-C: a Multimodal Multi-Purpose Dataset for Bemba May 26, 2023 Machine Translation speech-recognition
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition Oct 24, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Espresso: A Fast End-to-end Neural Speech Recognition Toolkit Sep 18, 2019 Automatic Speech Recognition Automatic Speech Recognition (ASR)
DNN-based mask estimation for distributed speech enhancement in spatially unconstrained microphone arrays Nov 3, 2020 Diversity Noise Estimation
Attention-Based Models for Speech Recognition Jun 24, 2015 Machine Translation Phoneme Recognition
A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline Sep 22, 2020 speech-recognition Speech Recognition
BrainBERT: Self-supervised representation learning for intracranial recordings Feb 28, 2023 Language Modeling Language Modelling
Bridging the Gap between Spatial and Spectral Domains: A Unified Framework for Graph Neural Networks Jul 21, 2021 Image Classification Natural Language Understanding
Extending Whisper with prompt tuning to target-speaker ASR Dec 13, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Attentive Sequence-to-Sequence Learning for Diacritic Restoration of Yorùbá Language Text Apr 3, 2018 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech Recognition Sep 21, 2023 speech-recognition Speech Recognition
DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering Mar 9, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Byakto Speech: Real-time long speech synthesis with convolutional neural network: Transfer learning from English to Bangla May 31, 2021 Deep Learning speech-recognition
Can Contextual Biasing Remain Effective with Whisper and GPT-2? Jun 2, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Calibrating Transformers via Sparse Gaussian Processes Mar 4, 2023 Bayesian Inference Gaussian Processes
Can we use Common Voice to train a Multi-Speaker TTS system? Oct 12, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings Jun 6, 2021 Machine Translation speech-recognition
A Comparison of Methods for OOV-word Recognition on a New Public Dataset Jul 16, 2021 Automatic Speech Recognition Automatic Speech Recognition (ASR)
A Cross-Modal Approach to Silent Speech with LLM-Enhanced Recognition Mar 2, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Attack on practical speaker verification system using universal adversarial perturbations May 19, 2021 Real-World Adversarial Attack Room Impulse Response (RIR)
A transfer learning based approach for pronunciation scoring Nov 1, 2021 Automatic Speech Recognition Automatic Speech Recognition (ASR)
AdaScale SGD: A User-Friendly Algorithm for Distributed Training Jul 9, 2020 image-classification Image Classification
FlowerFormer: Empowering Neural Architecture Encoding using a Flow-aware Graph Transformer Mar 19, 2024 Representation Learning speech-recognition
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition Sep 5, 2018 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Foundation Transformers Oct 12, 2022 Language Modeling Language Modelling
Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts Nov 2, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition Jan 11, 2022 Audio-Visual Speech Recognition speech-recognition
DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model Jun 2, 2023 speech-recognition Speech Recognition
ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications Nov 8, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement Jun 22, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Distilling Knowledge from Ensembles of Acoustic Models for Joint CTC-Attention End-to-End Speech Recognition May 19, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)