| Direct Punjabi to English speech translation using discrete units | Feb 25, 2024 | Speech-to-Speech Translation, Speech-to-Text | Unverified | 0 |
| Hands-Free VR | Feb 23, 2024 | Diversity, Language Modelling | Unverified | 0 |
| OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | Feb 20, 2024 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing? | Feb 19, 2024 | Speech-to-Text, Speech-to-Text Translation | Unverified | 0 |
| Syllable based DNN-HMM Cantonese Speech to Text System | Feb 13, 2024 | Speech Recognition | Unverified | 0 |
| Careless Whisper: Speech-to-Text Hallucination Harms | Feb 12, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| Named Entity Recognition for Address Extraction in Speech-to-Text Transcriptions Using Synthetic Data | Feb 8, 2024 | Named Entity Recognition | Unverified | 0 |
| Digits micro-model for accurate and secure transactions | Feb 2, 2024 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| A Case Study on Filtering for End-to-End Speech Translation | Feb 2, 2024 | Speech-to-Speech Translation, Speech-to-Text | Unverified | 0 |
| Streaming Sequence Transduction through Dynamic Compression | Feb 2, 2024 | Automatic Speech Recognition (ASR) | Code Available | 0 |