SOTAVerified

Speech-to-Text

Papers

Showing 51-75 of 403 papers

Title | Status | Hype
Cross Attention Augmented Transducer Networks for Simultaneous Translation | Code | 1
Information-Transport-based Policy for Simultaneous Translation | Code | 1
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models | Code | 1
Benchmarking Large Multimodal Models against Common Corruptions | Code | 1
One TTS Alignment To Rule Them All | Code | 1
Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement | Code | 1
Brilla AI: AI Contestant for the National Science and Maths Quiz | Code | 1
Kosp2e: Korean Speech to English Translation Corpus | Code | 1
"Listen, Understand and Translate": Triple Supervision Decouples End-to-end Speech-to-text Translation | Code | 1
Late reverberation suppression using U-nets | Code | 1
ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs | Code | 1
LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions | Code | 1
A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing | Code | 1
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation | Code | 1
CoVoST 2 and Massively Multilingual Speech-to-Text Translation | Code | 1
CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus | Code | 1
Careless Whisper: Speech-to-Text Hallucination Harms | Code | 0
Calibrated SVM for Probabilistic Classification of In-Vehicle Voices into Vehicle Commands via Voice-to-Text LLM Transformation | Code | 0
Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning | Code | 0
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision | Code | 0
Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy | Code | 0
Kurdish (Sorani) Speech to Text: Presenting an Experimental Dataset | Code | 0
Let's Give a Voice to Conversational Agents in Virtual Reality | Code | 0
BeaverTalk: Oregon State University's IWSLT 2025 Simultaneous Speech Translation System | Code | 0
Infusing Future Information into Monotonic Attention Through Language Models | Code | 0
Page 3 of 17

No leaderboard results yet.