| Using of heterogeneous corpora for training of an ASR system | Jun 1, 2017 | speech-recognition, Speech Recognition | —Unverified | 0 |
| VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation | May 25, 2023 | Decoder, Language Modeling | —Unverified | 0 |
| Visual Features for Context-Aware Speech Recognition | Dec 1, 2017 | Language Modeling, Language Modelling | —Unverified | 0 |
| Voice based self help System: User Experience Vs Accuracy | Apr 7, 2015 | speech-recognition, Speech Recognition | —Unverified | 0 |
| VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications | May 19, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| WaBERT: A Low-resource End-to-end Model for Spoken Language Understanding and Speech-to-BERT Alignment | Apr 22, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| wav2vec and its current potential to Automatic Speech Recognition in German for the usage in Digital History: A comparative assessment of available ASR-technologies for the use in cultural heritage contexts | Mar 6, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition | Sep 19, 2021 | Language Modeling, Language Modelling | —Unverified | 0 |
| WER-BERT: Automatic WER Estimation with BERT in a Balanced Ordinal Classification Paradigm | Jan 14, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| What shall we do with an hour of data? Speech recognition for the un- and under-served languages of Common Voice | May 10, 2021 | speech-recognition, Speech Recognition | —Unverified | 0 |
| When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation | Feb 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Which French speech recognition system for assistant robots? | Mar 4, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Whisper Finetuning on Nepali Language | Nov 19, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Whisper Turns Stronger: Augmenting Wav2Vec 2.0 for Superior ASR in Low-Resource Languages | Dec 31, 2024 | Automatic Speech Recognition, Data Augmentation | —Unverified | 0 |
| With One Voice: Composing a Travel Voice Assistant from Re-purposed Models | Aug 4, 2021 | BIG-bench Machine Learning, named-entity-recognition | —Unverified | 0 |
| Worldly Wise (WoW) - Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering | Jun 1, 2021 | Knowledge Graphs, Question Answering | —Unverified | 0 |
| XTREME-S: Evaluating Cross-lingual Speech Representations | Mar 21, 2022 | Representation Learning, Retrieval | —Unverified | 0 |
| Label-Synchronous Speech-to-Text Alignment for ASR Using Forward and Backward Transformers | Apr 21, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Language Model Augmented Monotonic Attention for Simultaneous Translation | Jul 1, 2022 | Language Modeling, Language Modelling | —Unverified | 0 |
| LASER: Attention with Exponential Transformation | Nov 5, 2024 | Speech-to-Text | —Unverified | 0 |
| LAST: Language Model Aware Speech Tokenization | Sep 5, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation | May 1, 2022 | Segmentation, Simultaneous Speech-to-Text Translation | —Unverified | 0 |
| Learnings from Technological Interventions in a Low Resource Language: A Case-Study on Gondi | Apr 21, 2020 | Machine Translation, Speech-to-Text | —Unverified | 0 |
| Leveraging Virtual Reality and AI Tutoring for Language Learning: A Case Study of a Virtual Campus Environment with OpenAI GPT Integration with Unity 3D | Nov 19, 2024 | Speech-to-Text, text-to-speech | —Unverified | 0 |
| Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation | Nov 5, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| LIA-RAG: a system based on graphs and divergence of probabilities applied to Speech-To-Text Summarization | Jan 26, 2016 | RAG, Speech-to-Text | —Unverified | 0 |
| LinTO Audio and Textual Datasets to Train and Evaluate Automatic Speech Recognition in Tunisian Arabic Dialect | Apr 3, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| LM-SPT: LM-Aligned Semantic Distillation for Speech Tokenization | Jun 20, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation | Feb 24, 2025 | Automatic Speech Recognition, Diversity | —Unverified | 0 |
| Low-Resource Speech-to-Text Translation | Mar 24, 2018 | Decoder, Machine Translation | —Unverified | 0 |
| M3ST: Mix at Three Levels for Speech Translation | Dec 7, 2022 | Data Augmentation, Diversity | —Unverified | 0 |
| MAM: Masked Acoustic Modeling for End-to-End Speech-to-Text Translation | Oct 22, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| MinMo: A Multimodal Large Language Model for Seamless Voice Interaction | Jan 10, 2025 | Instruction Following, Language Modeling | —Unverified | 0 |
| Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer | Oct 5, 2023 | Speech-to-Text, Speech-to-Text Translation | —Unverified | 0 |
| Multi-Discriminator Sobolev Defense-GAN Against Adversarial Attacks for End-to-End Speech Systems | Mar 15, 2021 | Speech-to-Text | —Unverified | 0 |
| Multilingual Speech Emotion Recognition With Multi-Gating Mechanism and Neural Architecture Search | Oct 31, 2022 | Emotion Recognition, Neural Architecture Search | —Unverified | 0 |
| Multilingual Speech Translation from Efficient Finetuning of Pretrained Models | Aug 1, 2021 | Decoder, Speech-to-Text | —Unverified | 0 |
| Multi-teacher Distillation for Multilingual Spelling Correction | Nov 20, 2023 | Multilingual NLP, Speech-to-Text | —Unverified | 0 |
| NAIST Simultaneous Speech-to-Text Translation System for IWSLT 2022 | May 1, 2022 | Segmentation, Simultaneous Speech-to-Text Translation | —Unverified | 0 |
| NAIST Simultaneous Speech Translation System for IWSLT 2024 | Jun 30, 2024 | Speech-to-Speech Translation, Speech-to-Text | —Unverified | 0 |
| Named Entity Detection and Injection for Direct Speech Translation | Oct 21, 2022 | Sentence, Speech-to-Text | —Unverified | 0 |
| Named Entity Recognition for Address Extraction in Speech-to-Text Transcriptions Using Synthetic Data | Feb 8, 2024 | named-entity-recognition, Named Entity Recognition | —Unverified | 0 |
| Natural Language Interactions in Autonomous Vehicles: Intent Detection and Slot Filling from Passenger Utterances | Apr 23, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Natural Language Robot Programming: NLP integrated with autonomous robotic grasping | Apr 6, 2023 | Robotic Grasping, Speech-to-Text | —Unverified | 0 |
| NaturalTurn: A Method to Segment Transcripts into Naturalistic Conversational Turns | Mar 22, 2024 | Speech-to-Text | —Unverified | 0 |
| NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts | Nov 8, 2024 | Mixture-of-Experts, Optical Character Recognition (OCR) | —Unverified | 0 |
| Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | Feb 26, 2025 | Audio Synthesis, Automatic Speech Recognition | —Unverified | 0 |
| N-gram Boosting: Improving Contextual Biasing with Normalized N-gram Targets | Aug 4, 2023 | Speech-to-Text | —Unverified | 0 |
| Noise in Speech-to-Text Voice: Analysis of Errors and Feasibility of Phonetic Similarity for Their Correction | Dec 1, 2013 | Decision Making, Speech Recognition | —Unverified | 0 |
| Numerically Grounded Language Models for Semantic Error Correction | Aug 14, 2016 | Fact Checking, Grammatical Error Correction | —Unverified | 0 |