SOTAVerified

Speech-to-Text

Papers

Showing 1-25 of 403 papers

Title | Status | Hype
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | Code | 6
High-Fidelity Simultaneous Speech-To-Speech Translation | Code | 5
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia | Code | 3
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation | Code | 3
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation | Code | 2
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT | Code | 2
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation | Code | 2
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation | Code | 2
Speech Model Pre-training for End-to-End Spoken Language Understanding | Code | 2
A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Speech Translation | Code | 2
SONAR: Sentence-Level Multimodal and Language-Agnostic Representations | Code | 2
DUB: Discrete Unit Back-translation for Speech Translation | Code | 1
EdiTTS: Score-based Editing for Controllable Text-to-Speech | Code | 1
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities | Code | 1
Deep Reinforcement Learning For Sequence to Sequence Models | Code | 1
Cross-modal Contrastive Learning for Speech Translation | Code | 1
Denial-of-Service Poisoning Attacks against Large Language Models | Code | 1
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation | Code | 1
Common Voice: A Massively-Multilingual Speech Corpus | Code | 1
CoVoST 2 and Massively Multilingual Speech-to-Text Translation | Code | 1
Brilla AI: AI Contestant for the National Science and Maths Quiz | Code | 1
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation | Code | 1
Benchmarking Large Multimodal Models against Common Corruptions | Code | 1
Clotho: An Audio Captioning Dataset | Code | 1
CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus | Code | 1

No leaderboard results yet.