ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus | Jul 29, 2023 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition | May 27, 2024 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos | Jun 16, 2020 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
AVATAR: Unconstrained Audiovisual Speech Recognition | Jun 15, 2022 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
AV Taris: Online Audio-Visual Speech Recognition | Dec 14, 2020 | Action Detection; Activity Detection
BackdoorMBTI: A Backdoor Learning Multimodal Benchmark Tool Kit for Backdoor Defense Evaluation | Nov 17, 2024 | Action Recognition; backdoor defense
Back Translation for Speech-to-text Translation Without Transcripts | May 15, 2023 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
Learning Delays in Spiking Neural Networks using Dilated Convolutions with Learnable Spacings | Jun 30, 2023 | Audio Classification; speech-recognition
BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data | Jan 28, 2021 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
Brazilian Portuguese Speech Recognition Using Wav2vec 2.0 | Jul 23, 2021 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition | Jul 4, 2024 | Audio-Visual Speech Recognition; speech-recognition
BembaSpeech: A Speech Recognition Corpus for the Bemba Language | Feb 9, 2021 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Oct 12, 2021 | Action Detection; Activity Detection
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion | Mar 29, 2022 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement | Jun 6, 2024 | Diversity; Speech Enhancement
Low-Latency Speech Separation Guided Diarization for Telephone Conversations | Apr 5, 2022 | Action Detection; Activity Detection
A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One | Feb 20, 2023 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English | Aug 3, 2021 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
ITALIC: An Italian Intent Classification Dataset | Jun 14, 2023 | Classification; intent-classification
Librispeech Transducer Model with Internal Language Model Prior Correction | Apr 7, 2021 | Language Modeling; Language Modelling
BrainBERT: Self-supervised representation learning for intracranial recordings | Feb 28, 2023 | Language Modeling; Language Modelling
Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps | Dec 29, 2020 | All; image-classification
BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing | Sep 2, 2023 | speech-recognition; Speech Recognition
A Resource for Computational Experiments on Mapudungun | Dec 4, 2019 | Machine Translation; speech-recognition
A Reference-less Quality Metric for Automatic Speech Recognition via Contrastive-Learning of a Multi-Language Model with Self-Supervision | Jun 21, 2023 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Oct 24, 2022 | Action Detection; Activity Detection
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling | Sep 6, 2020 | feature selection; speech-recognition
Byakto Speech: Real-time long speech synthesis with convolutional neural network: Transfer learning from English to Bangla | May 31, 2021 | Deep Learning; speech-recognition
A context-aware knowledge transferring strategy for CTC-based ASR | Oct 12, 2022 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
Calibrating Transformers via Sparse Gaussian Processes | Mar 4, 2023 | Bayesian Inference; Gaussian Processes
Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition | Mar 6, 2020 | Lipreading; Lip Reading
Integrating Lattice-Free MMI into End-to-End Speech Recognition | Mar 29, 2022 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
CB-Conformer: Contextual biasing Conformer for biased word recognition | Apr 19, 2023 | Automatic Speech Recognition; Language Modeling
Long Expressive Memory for Sequence Modeling | Oct 10, 2021 | Language Modeling; Language Modelling
Advancing Test-Time Adaptation in Wild Acoustic Test Settings | Oct 14, 2023 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
A Persian ASR-based SER: Modification of Sharif Emotional Speech Database and Investigation of Persian Text Corpora | Nov 18, 2022 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
An exact mapping between the Variational Renormalization Group and Deep Learning | Oct 14, 2014 | Deep Learning; speech-recognition
Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset | Oct 9, 2021 | Deep Learning; Emotion Recognition
ArTST: Arabic Text and Speech Transformer | Oct 25, 2023 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
MathSpeech: Leveraging Small LMs for Accurate Conversion in Mathematical Speech-to-Formula | Dec 20, 2024 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
Interactive Feature Fusion for End-to-End Noise-Robust Speech Recognition | Oct 11, 2021 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition | May 27, 2019 | Decoder; Language Modelling
Adversarial Attacks against Windows PE Malware Detection: A Survey of the State-of-the-Art | Dec 23, 2021 | Adversarial Attack; Malware Detection
CL-MASR: A Continual Learning Benchmark for Multilingual ASR | Oct 25, 2023 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
Incorporating External POS Tagger for Punctuation Restoration | Jun 12, 2021 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities | Feb 16, 2025 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
CMULAB: An Open-Source Framework for Training and Deployment of Natural Language Processing Models | Apr 3, 2024 | Optical Character Recognition (OCR); speech-recognition
indic-punct: An automatic punctuation restoration and inverse text normalization framework for Indic languages | Mar 31, 2022 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications | Apr 30, 2020 | Automatic Speech Recognition; Automatic Speech Recognition (ASR)
ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs | Jun 26, 2024 | ArzEn Code-switched Translation to ara; ArzEn Code-switched Translation to eng