SOTAVerified

Speech-to-Text

Papers

Showing 101-125 of 403 papers

Title | Status | Hype
LibriS2S: A German-English Speech-to-Speech Translation Corpus | Code | 0
Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models | Code | 0
Let's Give a Voice to Conversational Agents in Virtual Reality | Code | 0
Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation | Code | 0
A Change of Heart: Improving Speech Emotion Recognition through Speech-to-Text Modality Conversion | Code | 0
Contextualized Translation of Automatically Segmented Speech | Code | 0
Kurdish (Sorani) Speech to Text: Presenting an Experimental Dataset | Code | 0
mask-Net: Learning Context Aware Invariant Features using Adversarial Forgetting (Student Abstract) | Code | 0
Audio Adversarial Examples: Targeted Attacks on Speech-to-Text | Code | 0
SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation | Code | 0
Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach | Code | 0
Attentively Embracing Noise for Robust Latent Representation in BERT | Code | 0
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision | Code | 0
SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition | Code | 0
InstaIndoor and Multi-modal Deep Learning for Indoor Scene Recognition | Code | 0
Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN | Code | 0
Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation | Code | 0
Infusing Future Information into Monotonic Attention Through Language Models | Code | 0
Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning | Code | 0
Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Code | 0
Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models | Code | 0
Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions | Code | 0
fairseq S2T: Fast Speech-to-Text Modeling with fairseq | Code | 0
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild | Code | 0
End-to-End Automatic Speech Translation of Audiobooks | Code | 0

No leaderboard results yet.