Speech-to-Text

Papers


Showing 301–350 of 403 papers

Title	Date	Tasks	Status
ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020	May 24, 2020	Data Augmentation, Decoder	Unverified
Optimization of a Real-Time Wavelet-Based Algorithm for Improving Speech Intelligibility	Feb 5, 2022	Speech Enhancement, Speech-to-Text	Unverified
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification	Feb 20, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
PATCorrect: Non-autoregressive Phoneme-augmented Transformer for ASR Error Correction	Feb 10, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling	Jun 21, 2021	speech-recognition, Speech Recognition	Unverified
AeGAN: Time-Frequency Speech Denoising via Generative Adversarial Networks	Oct 21, 2019	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M	Jul 6, 2023	Speech-to-Text	Unverified
PhantomSound: Black-Box, Query-Efficient Audio Adversarial Attack via Split-Second Phoneme Injection	Sep 13, 2023	Adversarial Attack, Speech-to-Text	Unverified
Phonemic Representation and Transcription for Speech to Text Applications for Under-resourced Indigenous African Languages: The Case of Kiswahili	Oct 29, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Polish Read Speech Corpus for Speech Tools and Services	Jun 1, 2017	Action Detection, Activity Detection	Unverified
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison	Jan 4, 2025	Decoder, Knowledge Distillation	Unverified
Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases	Feb 1, 2024	speech-recognition, Speech Recognition	Unverified
Punctuation restoration in Swedish through fine-tuned KB-BERT	Feb 14, 2022	Language Modelling, Punctuation Restoration	Unverified
Pushing the performances of ASR models on English and Spanish accents	Dec 22, 2022	Speech-to-Text	Unverified
Recent Advances in Direct Speech-to-text Translation	Jun 20, 2023	Data Augmentation, Decoder	Unverified
Representation Purification for End-to-End Speech Translation	Dec 5, 2024	Machine Translation, Rhythm	Unverified
Revisiting End-to-End Speech-to-Text Translation From Scratch	Jun 9, 2022	Decoder, speech-recognition	Unverified
Revisiting the Entropy Semiring for Neural Speech Recognition	Dec 13, 2023	speech-recognition, Speech Recognition	Unverified
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking	Mar 13, 2024	Chinese Spell Checking, In-Context Learning	Unverified
Robust Semantic Communications for Speech Transmission	Mar 8, 2024	Generative Adversarial Network, Semantic Communication	Unverified
Role of Intonation in Scoring Spoken English	Aug 23, 2018	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks	Jul 14, 2022	Speech-to-Text	Unverified
S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamless Speech-Text Alignment and Streaming Speech Generation	Jun 11, 2025	Reading Comprehension, Speech Synthesis	Unverified
SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation	Oct 13, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation	May 17, 2022	Representation Learning, Retrieval	Unverified
Self-Supervised Representations Improve End-to-End Speech Translation	Jun 22, 2020	Cross-Lingual Transfer, speech-recognition	Unverified
Semantic-aware Speech to Text Transmission with Redundancy Removal	Feb 7, 2022	Semantic Communication, Speech-to-Text	Unverified
Semantic MIMO Systems for Speech-to-Text Transmission	May 13, 2024	Semantic Communication, Speech-to-Text	Unverified
Semantic-preserved Communication System for Highly Efficient Speech Transmission	May 25, 2022	Semantic Communication, speech-recognition	Unverified
Simple and Effective Unsupervised Speech Translation	Oct 18, 2022	Domain Adaptation, Machine Translation	Unverified
SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation	Jun 20, 2024	Speech-to-Text, Speech-to-Text Translation	Unverified
SimulSpeech: End-to-End Simultaneous Speech to Text Translation	Jul 1, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Speak2Label: Using Domain Knowledge for Creating a Large Scale Driver Gaze Zone Estimation Dataset	Apr 13, 2020	Gaze Prediction, Speech-to-Text	Unverified
Speaker Independent Continuous Speech to Text Converter for Mobile Application	Jul 19, 2013	Action Detection, Activity Detection	Unverified
Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction	May 8, 2013	Speech Synthesis, Speech-to-Text	Unverified
SpeechAlign: a Framework for Speech Translation Alignment Evaluation	Sep 20, 2023	Speech-to-Text, Speech-to-Text Translation	Unverified
Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody?	Oct 31, 2024	Rhythm, speech-recognition	Unverified
Speech Recognition Web Services for Dutch	May 1, 2014	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Speech to Speech Translation with Translatotron: A State of the Art Review	Feb 9, 2025	speech-recognition, Speech Recognition	Unverified
Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation	May 17, 2020	Computational Efficiency, speech-recognition	Unverified
Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding	Jun 8, 2023	dialog state tracking, Language Modeling	Unverified
Speech-to-Text and Evaluation of Multiple Machine Translation Systems	Sep 1, 2022	Machine Translation, Speech-to-Text	Unverified
Speech to text and text to speech recognition systems-A review	Mar 17, 2018	speech-recognition, Speech Recognition	Unverified
Speech-to-Text Translation with Phoneme-Augmented CoT: Enhancing Cross-Lingual Transfer in Low-Resource Scenarios	May 30, 2025	Cross-Lingual Transfer, Phoneme Recognition	Unverified
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?	Feb 19, 2024	Speech-to-Text, Speech-to-Text Translation	Unverified
SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation	Nov 3, 2024	speech-recognition, Speech Recognition	Unverified
SpiCE: A New Open-Access Corpus of Conversational Bilingual Speech in Cantonese and English	May 1, 2020	Sentence, Speech-to-Text	Unverified
Strategies for improving low resource speech to text translation relying on pre-trained ASR models	May 31, 2023	Automatic Speech Recognition, Decoder	Unverified
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection	Jun 10, 2024	Speech-to-Text, Speech-to-Text Translation	Unverified
STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions	May 30, 2023	All, Automatic Speech Recognition	Unverified

