| Audio Adversarial Examples: Attacks Using Vocal Masks | Feb 4, 2021 | Adversarial Attack, Speech-to-Text | —Unverified | 0 |
| Label-Synchronous Speech-to-Text Alignment for ASR Using Forward and Backward Transformers | Apr 21, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Comparison of SVD and factorized TDNN approaches for speech to text | Oct 13, 2021 | Speech-to-Text | —Unverified | 0 |
| Acquisition of high-quality images for camera calibration in robotics applications via speech prompts | Apr 15, 2025 | Camera Calibration, Speech-to-Text | —Unverified | 0 |
| Compact Speech Translation Models via Discrete Speech Units Pretraining | Feb 29, 2024 | Decoder, Self-Supervised Learning | —Unverified | 0 |
| Open Brain AI. Automatic Language Assessment | Jun 11, 2023 | Speech-to-Text | —Unverified | 0 |
| Language Model Augmented Monotonic Attention for Simultaneous Translation | Jul 1, 2022 | Language Modeling, Language Modelling | —Unverified | 0 |
| Graph Neural Networks to Predict Customer Satisfaction Following Interactions with a Corporate Call Center | Jan 31, 2021 | Graph Neural Network, Speech-to-Text | —Unverified | 0 |
| Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation | Jul 4, 2024 | Machine Translation, speech-recognition | —Unverified | 0 |
| Findings of the Third Workshop on Automatic Simultaneous Translation | Jul 1, 2022 | Speech-to-Text, Translation | —Unverified | 0 |
| Hands-Free VR | Feb 23, 2024 | Diversity, Language Modelling | —Unverified | 0 |
| Hearing voices at the National Library -- a speech corpus and acoustic model for the Swedish language | May 6, 2022 | speech-recognition, Speech Recognition | —Unverified | 0 |
| Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks | Jan 18, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| How "Real" is Your Real-Time Simultaneous Speech-to-Text Translation System? | Dec 24, 2024 | Simultaneous Speech-to-Text Translation, Speech-to-Text | —Unverified | 0 |
| How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not | Sep 25, 2024 | Automatic Speech Recognition, speech-recognition | —Unverified | 0 |
| Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks | May 4, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Findings of the Second Workshop on Automatic Simultaneous Translation | Jun 1, 2021 | Machine Translation, Speech-to-Text | —Unverified | 0 |
| Fast Labeling and Transcription with the Speechalyzer Toolkit | May 1, 2012 | Audio Classification, Benchmarking | —Unverified | 0 |
| Impact of Microphone position Measurement Error on Multi Channel Distant Speech Recognition & Intelligibility | Dec 1, 2021 | Distant Speech Recognition, Position | —Unverified | 0 |
| Improved Cross-Lingual Transfer Learning For Automatic Speech Translation | Jun 1, 2023 | automatic-speech-translation, Cross-Lingual Transfer | —Unverified | 0 |
| Improve Sinhala Speech Recognition Through e2e LF-MMI Model | Dec 1, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Improving Autoregressive NLP Tasks via Modular Linearized Attention | Apr 17, 2023 | Computational Efficiency, Machine Translation | —Unverified | 0 |
| Attention-Based End-to-End Speech Recognition on Voice Search | Jul 22, 2017 | Decoder, L2 Regularization | —Unverified | 0 |
| Improving Hypernasality Estimation with Automatic Speech Recognition in Cleft Palate Speech | Aug 10, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders | Sep 14, 2023 | Contrastive Learning, Knowledge Distillation | —Unverified | 0 |
| Extending RNN-T-based speech recognition systems with emotion and language classification | Jul 28, 2022 | Emotion Classification, Emotion Recognition | —Unverified | 0 |
| Improving Metrics for Speech Translation | May 22, 2023 | Speech-to-Text, Translation | —Unverified | 0 |
| Improving RNN-Transducers with Acoustic LookAhead | Jul 11, 2023 | Hallucination, Speech-to-Text | —Unverified | 0 |
| AI-Powered Immersive Assistance for Interactive Task Execution in Industrial Environments | Jul 12, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Improving Speech Recognition Accuracy Using Custom Language Models with the Vosk Toolkit | Mar 26, 2025 | speech-recognition, Speech Recognition | —Unverified | 0 |
| Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task | Jul 12, 2021 | Decoder, Knowledge Distillation | —Unverified | 0 |
| Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach | Oct 6, 2023 | Simultaneous Speech-to-Text Translation, Speech-to-Text | —Unverified | 0 |
| Exploring Transfer Learning For End-to-End Spoken Language Understanding | Dec 15, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| IMS-Speech: A Speech to Text Tool | Aug 13, 2019 | speech-recognition, Speech Recognition | —Unverified | 0 |
| Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset | Jun 15, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Attacks as Defenses: Designing Robust Audio CAPTCHAs Using Attacks on Automatic Speech Recognition Systems | Mar 10, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation | Oct 11, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Kencorpus: A Kenyan Language Corpus of Swahili, Dholuo and Luhya for Natural Language Processing Tasks | Aug 25, 2022 | Machine Translation, Part-Of-Speech Tagging | —Unverified | 0 |
| Multilingual Speech Translation with Efficient Finetuning of Pretrained Models | Oct 24, 2020 | Cross-Lingual Transfer, Decoder | —Unverified | 0 |
| Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing | Sep 27, 2023 | Decoder, Machine Translation | —Unverified | 0 |
| LASER: Attention with Exponential Transformation | Nov 5, 2024 | Speech-to-Text | —Unverified | 0 |
| Interpreting Strategies Annotation in the WAW Corpus | Sep 1, 2017 | Machine Translation, Speech-to-Text | —Unverified | 0 |
| Investigating Decoder-only Large Language Models for Speech-to-text Translation | Jul 3, 2024 | Decoder, parameter-efficient fine-tuning | —Unverified | 0 |
| Existential Crisis: A Social Robot's Reason for Being | Jan 6, 2025 | Speech-to-Text | —Unverified | 0 |
| Evaluation of real-time transcriptions using end-to-end ASR models | Sep 9, 2024 | Action Detection, Activity Detection | —Unverified | 0 |
| Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages | Nov 11, 2024 | Decoder, Machine Translation | —Unverified | 0 |
| I Speak and You Find: Robust 3D Visual Grounding with Noisy and Ambiguous Speech Inputs | Jun 17, 2025 | 3D visual grounding, Contrastive Learning | —Unverified | 0 |
| CMU's IWSLT 2024 Simultaneous Speech Translation System | Aug 14, 2024 | Decoder, Speech-to-Text | —Unverified | 0 |
| Evaluating Voice Command Pipelines for Drone Control: From STT and LLM to Direct Classification and Siamese Networks | Jul 10, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Europarl-ST: A Multilingual Corpus For Speech Translation Of Parliamentary Debates | Nov 8, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |