SOTAVerified

Speech-to-Text

Papers

Showing 301–325 of 403 papers

Title | Status | Hype
ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020 | | 0
Optimization of a Real-Time Wavelet-Based Algorithm for Improving Speech Intelligibility | | 0
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | | 0
PATCorrect: Non-autoregressive Phoneme-augmented Transformer for ASR Error Correction | | 0
Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling | | 0
AeGAN: Time-Frequency Speech Denoising via Generative Adversarial Networks | | 0
Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M | | 0
PhantomSound: Black-Box, Query-Efficient Audio Adversarial Attack via Split-Second Phoneme Injection | | 0
Phonemic Representation and Transcription for Speech to Text Applications for Under-resourced Indigenous African Languages: The Case of Kiswahili | | 0
Polish Read Speech Corpus for Speech Tools and Services | | 0
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison | | 0
Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases | | 0
Punctuation restoration in Swedish through fine-tuned KB-BERT | | 0
Pushing the performances of ASR models on English and Spanish accents | | 0
Recent Advances in Direct Speech-to-text Translation | | 0
Representation Purification for End-to-End Speech Translation | | 0
Revisiting End-to-End Speech-to-Text Translation From Scratch | | 0
Revisiting the Entropy Semiring for Neural Speech Recognition | | 0
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking | | 0
Robust Semantic Communications for Speech Transmission | | 0
Role of Intonation in Scoring Spoken English | | 0
RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks | | 0
S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamless Speech-Text Alignment and Streaming Speech Generation | | 0
SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation | | 0
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation | | 0
Page 13 of 17

No leaderboard results yet.