SOTAVerified

Speech-to-Text Translation

Translate speech audio in one language into text in another language, using either an end-to-end model or a cascade of speech recognition followed by machine translation.

Papers

Showing 1-25 of 146 papers

Title | Status | Hype
End-to-End Speech Translation for Low-Resource Languages Using Weakly Labeled Data | - | 0
S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamless Speech-Text Alignment and Streaming Speech Generation | - | 0
Speech-to-Text Translation with Phoneme-Augmented CoT: Enhancing Cross-Lingual Transfer in Low-Resource Scenarios | - | 0
Improving Language and Modality Transfer in Translation by Character-level Modeling | - | 0
BeaverTalk: Oregon State University's IWSLT 2025 Simultaneous Speech Translation System | Code | 0
Audio Jailbreak Attacks: Exposing Vulnerabilities in SpeechGPT in a White-Box Framework | Code | 1
MEDIBENG WHISPER TINY: A FINE-TUNED CODE-SWITCHED BENGALI-ENGLISH TRANSLATOR FOR CLINICAL APPLICATIONS | Code | 1
AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation | - | 0
Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | - | 0
Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM | - | 0
SparQLe: Speech Queries to Text Translation Through LLMs | Code | 0
Speech to Speech Translation with Translatotron: A State of the Art Review | - | 0
When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation | - | 0
Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Code | 0
How "Real" is Your Real-Time Simultaneous Speech-to-Text Translation System? | - | 0
Representation Purification for End-to-End Speech Translation | - | 0
Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages | - | 0
Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody? | - | 0
A Survey on Speech Large Language Models | - | 0
Contextual Biasing to Improve Domain-specific Custom Vocabulary Audio Transcription without Explicit Fine-Tuning of Whisper Model | - | 0
Unveiling the Role of Pretraining in Direct Speech Translation | - | 0
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models | Code | 1
CoVoSwitch: Machine Translation of Synthetic Code-Switched Text Based on Intonation Units | Code | 0
Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models | Code | 0
Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Task Modulation + Multitask Learning (ASR/MT) + Data Augmentation | Case-sensitive sacreBLEU | 28.88 | - | Unverified
2 | Wav2Vec2.0+mBART+Adaptors | Case-sensitive sacreBLEU | 28.22 | - | Unverified
3 | Transformer + Meta Learning (ASR/MT) + Data Augmentation | Case-sensitive sacreBLEU | 27.51 | - | Unverified
4 | Transformer with Adapters | Case-sensitive sacreBLEU | 24.63 | - | Unverified
5 | Dual-decoder Transformer | Case-sensitive sacreBLEU | 23.63 | - | Unverified
6 | Speechformer | Case-sensitive sacreBLEU | 23.6 | - | Unverified
7 | Transformer + ASR Pretrain | Case-sensitive sacreBLEU | 22.8 | - | Unverified
8 | Transformer + ASR Pretrain | Case-sensitive sacreBLEU | 22.7 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Transformer with Adapters | Case-sensitive sacreBLEU | 28.73 | - | Unverified
2 | Speechformer | Case-sensitive sacreBLEU | 28.5 | - | Unverified
3 | Dual-decoder Transformer | Case-sensitive sacreBLEU | 28.12 | - | Unverified
4 | Transformer + ASR Pretrain + SpecAug | Case-sensitive sacreBLEU | 27.4 | - | Unverified
5 | Transformer + ASR Pretrain | Case-sensitive sacreBLEU | 26.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Dual-decoder Transformer | Case-sensitive sacreBLEU | 33.45 | - | Unverified
2 | Transformer + ASR Pretrain + SpecAug | Case-sensitive sacreBLEU | 33.3 | - | Unverified
3 | Transformer + ASR Pretrain | Case-sensitive sacreBLEU | 32.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SeamlessM4T Large | BLEU | 30.6 | - | Unverified
2 | SeamlessM4T Medium | BLEU | 26.6 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SeamlessM4T Large | BLEU | 34.1 | - | Unverified
2 | SeamlessM4T Medium | BLEU | 29.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SeamlessM4T Large | BLEU | 21.5 | - | Unverified
2 | SeamlessM4T Medium | BLEU | 19.2 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SeamlessM4T Large | BLEU | 24 | - | Unverified
2 | SeamlessM4T Medium | BLEU | 20.9 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Transformer + ASR Pretrain + SpecAug | Case-insensitive sacreBLEU | 17.2 | - | Unverified
2 | Transformer + ASR Pretrain | Case-insensitive sacreBLEU | 16.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MediBeng Whisper Tiny | Bleu | 0.98 | - | Unverified
2 | Whisper Tiny | Bleu | 0.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Transformer with Adapters | SacreBLEU | 26.61 | - | Unverified
2 | Dual-decoder Transformer | SacreBLEU | 25.62 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Speechformer | Case-sensitive sacreBLEU | 27.7 | - | Unverified