| VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications | May 19, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| WaBERT: A Low-resource End-to-end Model for Spoken Language Understanding and Speech-to-BERT Alignment | Apr 22, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| wav2vec and its current potential to Automatic Speech Recognition in German for the usage in Digital History: A comparative assessment of available ASR-technologies for the use in cultural heritage contexts | Mar 6, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition | Sep 19, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| WER-BERT: Automatic WER Estimation with BERT in a Balanced Ordinal Classification Paradigm | Jan 14, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| What shall we do with an hour of data? Speech recognition for the un- and under-served languages of Common Voice | May 10, 2021 | speech-recognition, Speech Recognition | Unverified | 0 |
| When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation | Feb 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Which French speech recognition system for assistant robots? | Mar 4, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Whisper Finetuning on Nepali Language | Nov 19, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Whisper Turns Stronger: Augmenting Wav2Vec 2.0 for Superior ASR in Low-Resource Languages | Dec 31, 2024 | Automatic Speech Recognition, Data Augmentation | Unverified | 0 |
| With One Voice: Composing a Travel Voice Assistant from Re-purposed Models | Aug 4, 2021 | BIG-bench Machine Learning, named-entity-recognition | Unverified | 0 |
| Worldly Wise (WoW) - Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering | Jun 1, 2021 | Knowledge Graphs, Question Answering | Unverified | 0 |
| XTREME-S: Evaluating Cross-lingual Speech Representations | Mar 21, 2022 | Representation Learning, Retrieval | Unverified | 0 |
| Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation | May 1, 2022 | Segmentation, Simultaneous Speech-to-Text Translation | Unverified | 0 |
| Learnings from Technological Interventions in a Low Resource Language: A Case-Study on Gondi | Apr 21, 2020 | Machine Translation, Speech-to-Text | Unverified | 0 |
| Leveraging Virtual Reality and AI Tutoring for Language Learning: A Case Study of a Virtual Campus Environment with OpenAI GPT Integration with Unity 3D | Nov 19, 2024 | Speech-to-Text, text-to-speech | Unverified | 0 |
| Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation | Nov 5, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| LIA-RAG: a system based on graphs and divergence of probabilities applied to Speech-To-Text Summarization | Jan 26, 2016 | RAG, Speech-to-Text | Unverified | 0 |
| LinTO Audio and Textual Datasets to Train and Evaluate Automatic Speech Recognition in Tunisian Arabic Dialect | Apr 3, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| LM-SPT: LM-Aligned Semantic Distillation for Speech Tokenization | Jun 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation | Feb 24, 2025 | Automatic Speech Recognition, Diversity | Unverified | 0 |
| Low-Resource Speech-to-Text Translation | Mar 24, 2018 | Decoder, Machine Translation | Unverified | 0 |
| M3ST: Mix at Three Levels for Speech Translation | Dec 7, 2022 | Data Augmentation, Diversity | Unverified | 0 |
| MAM: Masked Acoustic Modeling for End-to-End Speech-to-Text Translation | Oct 22, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| MinMo: A Multimodal Large Language Model for Seamless Voice Interaction | Jan 10, 2025 | Instruction Following, Language Modeling | Unverified | 0 |