| SONAR: Sentence-Level Multimodal and Language-Agnostic Representations | Aug 22, 2023 | Decoder, Machine Translation | Code Available | 2 |
| SeamlessM4T: Massively Multilingual & Multimodal Machine Translation | Aug 22, 2023 | Automatic Speech Recognition, Machine Translation | Code Available | 2 |
| Towards an AI to Win Ghana's National Science and Maths Quiz | Aug 8, 2023 | Math, Question Answering | Code Available | 1 |
| Let's Give a Voice to Conversational Agents in Virtual Reality | Aug 4, 2023 | Speech-to-Text, Text-to-Speech | Code Available | 0 |
| N-gram Boosting: Improving Contextual Biasing with Normalized N-gram Targets | Aug 4, 2023 | Speech-to-Text | Unverified | 0 |
| Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN | Jul 24, 2023 | Automatic Speech Recognition, Sentiment Analysis | Code Available | 0 |
| A Change of Heart: Improving Speech Emotion Recognition through Speech-to-Text Modality Conversion | Jul 21, 2023 | Automatic Speech Recognition (ASR) | Code Available | 0 |
| Improving RNN-Transducers with Acoustic LookAhead | Jul 11, 2023 | Hallucination, Speech-to-Text | Unverified | 0 |
| On decoder-only architecture for speech-to-text and large language model integration | Jul 8, 2023 | Decoder, Language Modeling | Unverified | 0 |
| Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M | Jul 6, 2023 | Speech-to-Text | Unverified | 0 |