| A comparative analysis between Conformer-Transducer, Whisper, and wav2vec2 for improving the child speech recognition | Nov 7, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| Personalizing ASR for Dysarthric and Accented Speech with Limited Data | Jul 31, 2019 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| The OCON model: an old but gold solution for distributable supervised classification | Oct 5, 2024 | Automatic Speech RecognitionClassification | CodeCode Available | 0 |
| ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit | Oct 24, 2019 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation | Jan 7, 2024 | Audio-Visual Speech RecognitionAutomatic Speech Recognition | CodeCode Available | 0 |
| PersonaTAB: Predicting Personality Traits using Textual, Acoustic, and Behavioral Cues in Fully-Duplex Speech Dialogs | May 20, 2025 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| Transformer Based Punctuation Restoration for Turkish | Sep 15, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| Multi-Stage Speaker Diarization for Noisy Classrooms | May 16, 2025 | Action DetectionActivity Detection | CodeCode Available | 0 |
| Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords | Feb 3, 2021 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| Bigger is not Always Better: The Effect of Context Size on Speech Pre-Training | Dec 3, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |