| I Speak and You Find: Robust 3D Visual Grounding with Noisy and Ambiguous Speech Inputs | Jun 17, 2025 | 3D visual grounding, Contrastive Learning | —Unverified | 0 | 0 |
| Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages | Nov 11, 2024 | Decoder, Machine Translation | —Unverified | 0 | 0 |
| DARTS: Dialectal Arabic Transcription System | Sep 26, 2019 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| CUIfy the XR: An Open-Source Package to Embed LLM-powered Conversational Agents in XR | Nov 7, 2024 | Language Modelling, Large Language Model | —Unverified | 0 | 0 |
| CTC Alignments Improve Autoregressive Translation | Oct 11, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| A Voice Controlled E-Commerce Web Application | Nov 16, 2018 | Medical Diagnosis, speech-recognition | —Unverified | 0 | 0 |
| An Empirical Evaluation of AI-Powered Non-Player Characters' Perceived Realism and Performance in Virtual Reality Environments | Jul 14, 2025 | Speech-to-Text, text-to-speech | —Unverified | 0 | 0 |
| A Dutch Dysarthric Speech Database for Individualized Speech Therapy Research | May 1, 2016 | Sentence, Speech-to-Text | —Unverified | 0 | 0 |
| A combined approach to the analysis of speech conversations in a contact center domain | Mar 12, 2022 | Speech-to-Text | —Unverified | 0 | 0 |
| A Benchmarking on Cloud based Speech-To-Text Services for French Speech and Background Noise Effect | May 7, 2021 | Benchmarking, Speech-to-Text | —Unverified | 0 | 0 |
| Multilingual Speech Translation with Efficient Finetuning of Pretrained Models | Oct 24, 2020 | Cross-Lingual Transfer, Decoder | —Unverified | 0 | 0 |
| Inductive biases, pretraining and fine-tuning jointly account for brain responses to speech | Feb 25, 2021 | Scene Classification, Speech-to-Text | —Unverified | 0 | 0 |
| Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing | Sep 27, 2023 | Decoder, Machine Translation | —Unverified | 0 | 0 |
| Incorporating Domain Knowledge To Improve Topic Segmentation Of Long MOOC Lecture Videos | Dec 8, 2020 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Improving the previous state-of-the-art Frisian ASR by fine-tuning XLS-R | Mar 31, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Cross-modal Contrastive Learning for Speech Translation | Dec 17, 2021 | Contrastive Learning, Retrieval | —Unverified | 0 | 0 |
| Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions | Jun 30, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task | Jul 12, 2021 | Decoder, Knowledge Distillation | —Unverified | 0 | 0 |
| Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses | Oct 20, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Crossing the SSH Bridge with Interview Data | May 1, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Improving Metrics for Speech Translation | May 22, 2023 | Speech-to-Text, Translation | —Unverified | 0 | 0 |
| Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model | Feb 27, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Improving Language and Modality Transfer in Translation by Character-level Modeling | May 30, 2025 | Speech-to-Text, Speech-to-Text Translation | —Unverified | 0 | 0 |
| Automated Testing of AI Models | Oct 7, 2021 | Fairness, Speech-to-Text | —Unverified | 0 | 0 |
| Analyzing ASR pretraining for low-resource speech-to-text translation | Oct 23, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Improving Hypernasality Estimation with Automatic Speech Recognition in Cleft Palate Speech | Aug 10, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Improving Cross-Lingual Transfer Learning for End-to-End Speech Recognition with Speech Translation | Jun 9, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Improving RNN-Transducers with Acoustic LookAhead | Jul 11, 2023 | Hallucination, Speech-to-Text | —Unverified | 0 | 0 |
| Improving Autoregressive NLP Tasks via Modular Linearized Attention | Apr 17, 2023 | Computational Efficiency, Machine Translation | —Unverified | 0 | 0 |
| Improving Speech Recognition Accuracy Using Custom Language Models with the Vosk Toolkit | Mar 26, 2025 | speech-recognition, Speech Recognition | —Unverified | 0 | 0 |
| Improve Sinhala Speech Recognition Through e2e LF-MMI Model | Dec 1, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach | Oct 6, 2023 | Simultaneous Speech-to-Text Translation, Speech-to-Text | —Unverified | 0 | 0 |
| CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving | Jun 16, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| IMS-Speech: A Speech to Text Tool | Aug 13, 2019 | speech-recognition, Speech Recognition | —Unverified | 0 | 0 |
| AudioPaLM: A Large Language Model That Can Speak and Listen | Jun 22, 2023 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Improved Cross-Lingual Transfer Learning For Automatic Speech Translation | Jun 1, 2023 | automatic-speech-translation, Cross-Lingual Transfer | —Unverified | 0 | 0 |
| Impact of Microphone position Measurement Error on Multi Channel Distant Speech Recognition & Intelligibility | Dec 1, 2021 | Distant Speech Recognition, Position | —Unverified | 0 | 0 |
| COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning | Nov 3, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Infusing Future Information into Monotonic Attention Through Language Models | Sep 7, 2021 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Ideal-LLM: Integrating Dual Encoders and Language-Adapted LLM for Multilingual Speech-to-Text | Sep 17, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Instance-Based Model Adaptation For Direct Speech Translation | Oct 23, 2019 | Domain Adaptation, Speech-to-Text | —Unverified | 0 | 0 |
| Interpreting Strategies Annotation in the WAW Corpus | Sep 1, 2017 | Machine Translation, Speech-to-Text | —Unverified | 0 | 0 |
| Investigating Decoder-only Large Language Models for Speech-to-text Translation | Jul 3, 2024 | Decoder, parameter-efficient fine-tuning | —Unverified | 0 | 0 |
| Corpus Creation and Evaluation for Speech-to-Text and Speech Translation | Aug 1, 2021 | Machine Translation, Speech-to-Text | —Unverified | 0 | 0 |
| A low latency ASR-free end to end spoken language understanding system | Nov 10, 2020 | Speech-to-Text, Spoken Language Understanding | —Unverified | 0 | 0 |
| Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks | May 4, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not | Sep 25, 2024 | Automatic Speech Recognition, speech-recognition | —Unverified | 0 | 0 |
| Conversational Recommendation System using NLP and Sentiment Analysis | May 17, 2025 | Conversational Recommendation, Dynamic Time Warping | —Unverified | 0 | 0 |
| How "Real" is Your Real-Time Simultaneous Speech-to-Text Translation System? | Dec 24, 2024 | Simultaneous Speech-to-Text Translation, Speech-to-Text | —Unverified | 0 | 0 |
| Contextualized Translation of Automatically Segmented Speech | Aug 5, 2020 | Segmentation, Sentence | —Unverified | 0 | 0 |