Speech-to-Text

Papers


Showing 151–200 of 403 papers

Title	Date	Tasks	Status
Audio Adversarial Examples: Attacks Using Vocal Masks	Feb 4, 2021	Adversarial Attack, Speech-to-Text	Unverified
Comparison of SVD and factorized TDNN approaches for speech to text	Oct 13, 2021	Speech-to-Text	Unverified
Acquisition of high-quality images for camera calibration in robotics applications via speech prompts	Apr 15, 2025	Camera Calibration, Speech-to-Text	Unverified
Compact Speech Translation Models via Discrete Speech Units Pretraining	Feb 29, 2024	Decoder, Self-Supervised Learning	Unverified
Kencorpus: A Kenyan Language Corpus of Swahili, Dholuo and Luhya for Natural Language Processing Tasks	Aug 25, 2022	Machine Translation, Part-Of-Speech Tagging	Unverified
Open Brain AI. Automatic Language Assessment	Jun 11, 2023	Speech-to-Text	Unverified
Label-Synchronous Speech-to-Text Alignment for ASR Using Forward and Backward Transformers	Apr 21, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Graph Neural Networks to Predict Customer Satisfaction Following Interactions with a Corporate Call Center	Jan 31, 2021	Graph Neural Network, Speech-to-Text	Unverified
Contextual Biasing to Improve Domain-specific Custom Vocabulary Audio Transcription without Explicit Fine-Tuning of Whisper Model	Oct 24, 2024	speech-recognition, Speech Recognition	Unverified
Handling and extracting key entities from customer conversations using Speech recognition and Named Entity recognition	Nov 28, 2022	named-entity-recognition, Named Entity Recognition	Unverified
Hands-Free VR	Feb 23, 2024	Diversity, Language Modelling	Unverified
Hearing voices at the National Library -- a speech corpus and acoustic model for the Swedish language	May 6, 2022	speech-recognition, Speech Recognition	Unverified
Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation	Jul 4, 2024	Machine Translation, speech-recognition	Unverified
How "Real" is Your Real-Time Simultaneous Speech-to-Text Translation System?	Dec 24, 2024	Simultaneous Speech-to-Text Translation, Speech-to-Text	Unverified
How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not	Sep 25, 2024	Automatic Speech Recognition, speech-recognition	Unverified
Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks	May 4, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Findings of the Third Workshop on Automatic Simultaneous Translation	Jul 1, 2022	Speech-to-Text, Translation	Unverified
Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks	Jan 18, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Impact of Microphone position Measurement Error on Multi Channel Distant Speech Recognition & Intelligibility	Dec 1, 2021	Distant Speech Recognition, Position	Unverified
Improved Cross-Lingual Transfer Learning For Automatic Speech Translation	Jun 1, 2023	automatic-speech-translation, Cross-Lingual Transfer	Unverified
Improve Sinhala Speech Recognition Through e2e LF-MMI Model	Dec 1, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Improving Autoregressive NLP Tasks via Modular Linearized Attention	Apr 17, 2023	Computational Efficiency, Machine Translation	Unverified
Findings of the Second Workshop on Automatic Simultaneous Translation	Jun 1, 2021	Machine Translation, Speech-to-Text	Unverified
Improving Hypernasality Estimation with Automatic Speech Recognition in Cleft Palate Speech	Aug 10, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Fast Labeling and Transcription with the Speechalyzer Toolkit	May 1, 2012	Audio Classification, Benchmarking	Unverified
Attention-Based End-to-End Speech Recognition on Voice Search	Jul 22, 2017	Decoder, L2 Regularization	Unverified
Improving Metrics for Speech Translation	May 22, 2023	Speech-to-Text, Translation	Unverified
Improving RNN-Transducers with Acoustic LookAhead	Jul 11, 2023	Hallucination, Speech-to-Text	Unverified
CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders	Sep 14, 2023	Contrastive Learning, Knowledge Distillation	Unverified
Improving Speech Recognition Accuracy Using Custom Language Models with the Vosk Toolkit	Mar 26, 2025	speech-recognition, Speech Recognition	Unverified
Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task	Jul 12, 2021	Decoder, Knowledge Distillation	Unverified
Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach	Oct 6, 2023	Simultaneous Speech-to-Text Translation, Speech-to-Text	Unverified
Extending RNN-T-based speech recognition systems with emotion and language classification	Jul 28, 2022	Emotion Classification, Emotion Recognition	Unverified
IMS-Speech: A Speech to Text Tool	Aug 13, 2019	speech-recognition, Speech Recognition	Unverified
AI-Powered Immersive Assistance for Interactive Task Execution in Industrial Environments	Jul 12, 2024	Language Modeling, Language Modelling	Unverified
Exploring Transfer Learning For End-to-End Spoken Language Understanding	Dec 15, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset	Jun 15, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Attacks as Defenses: Designing Robust Audio CAPTCHAs Using Attacks on Automatic Speech Recognition Systems	Mar 10, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Infusing Future Information into Monotonic Attention Through Language Models	Sep 7, 2021	Language Modeling, Language Modelling	Unverified
Multilingual Speech Translation with Efficient Finetuning of Pretrained Models	Oct 24, 2020	Cross-Lingual Transfer, Decoder	Unverified
A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation	Oct 11, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Interpreting Strategies Annotation in the WAW Corpus	Sep 1, 2017	Machine Translation, Speech-to-Text	Unverified
Investigating Decoder-only Large Language Models for Speech-to-text Translation	Jul 3, 2024	Decoder, parameter-efficient fine-tuning	Unverified
Jointly Trained Transformers models for Spoken Language Translation	Apr 25, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Language Model Augmented Monotonic Attention for Simultaneous Translation	Jul 1, 2022	Language Modeling, Language Modelling	Unverified
Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages	Nov 11, 2024	Decoder, Machine Translation	Unverified
I Speak and You Find: Robust 3D Visual Grounding with Noisy and Ambiguous Speech Inputs	Jun 17, 2025	3D visual grounding, Contrastive Learning	Unverified
Existential Crisis: A Social Robot's Reason for Being	Jan 6, 2025	Speech-to-Text	Unverified
Evaluation of real-time transcriptions using end-to-end ASR models	Sep 9, 2024	Action Detection, Activity Detection	Unverified
CMU's IWSLT 2024 Simultaneous Speech Translation System	Aug 14, 2024	Decoder, Speech-to-Text	Unverified

