| NusaCrowd: Open Source Initiative for Indonesian NLP Resources | Dec 19, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric | Dec 16, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| Towards A Unified Conformer Structure: from ASR to ASV Task | Nov 14, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement | Sep 22, 2022 | Audio Super-Resolution, Automatic Speech Recognition | Code Available | 2 |
| SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning | Jun 16, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| Squeezeformer: An Efficient Transformer for Automatic Speech Recognition | Jun 2, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| 4-bit Conformer with Native Quantization Aware Training for Speech Recognition | Mar 29, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| CMGAN: Conformer-based Metric GAN for Speech Enhancement | Mar 28, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction | Jan 5, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| Robust Self-Supervised Audio-Visual Speech Recognition | Jan 5, 2022 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 2 |
| Fast Transformers with Clustered Attention | Jul 9, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities | May 23, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition | May 22, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages | Mar 30, 2025 | Automatic Speech Recognition, Language Modeling | Code Available | 1 |
| DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities | Feb 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverberation and Blind RIR Identification | Feb 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models | Feb 9, 2025 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 1 |
| Sagalee: an Open Source Automatic Speech Recognition Dataset for Oromo Language | Feb 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| FlanEC: Exploring Flan-T5 for Post-ASR Error Correction | Jan 22, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation | Jan 1, 2025 | Automatic Speech Recognition, Decoder | Code Available | 1 |
| MathSpeech: Leveraging Small LMs for Accurate Conversion in Mathematical Speech-to-Formula | Dec 20, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| XLSR-Mamba: A Dual-Column Bidirectional State Space Model for Spoofing Attack Detection | Nov 15, 2024 | Audio Deepfake Detection, Automatic Speech Recognition | Code Available | 1 |
| Enhancing Multimodal Sentiment Analysis for Missing Modality through Self-Distillation and Unified Modality Cross-Attention | Oct 19, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| VHASR: A Multimodal Speech Recognition System With Vision Hotwords | Oct 1, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Mamba for Streaming ASR Combined with Unimodal Aggregation | Sep 30, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |