SOTAVerified

Speech-to-Text

Papers

Showing 176-200 of 403 papers

Title | Status | Hype
Developing automatic verbatim transcripts for international multilingual meetings: an end-to-end solution | | 0
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing | | 0
Deepfake audio as a data augmentation technique for training automatic speech to text transcription models | | 0
SpeechAlign: a Framework for Speech Translation Alignment Evaluation | | 0
CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders | | 0
PhantomSound: Black-Box, Query-Efficient Audio Adversarial Attack via Split-Second Phoneme Injection | | 0
An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation | Code | 0
N-gram Boosting: Improving Contextual Biasing with Normalized N-gram Targets | | 0
Let's Give a Voice to Conversational Agents in Virtual Reality | Code | 0
Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN | Code | 0
A Change of Heart: Improving Speech Emotion Recognition through Speech-to-Text Modality Conversion | Code | 0
Improving RNN-Transducers with Acoustic LookAhead | | 0
On decoder-only architecture for speech-to-text and large language model integration | | 0
Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M | | 0
Online Hybrid CTC/Attention End-to-End Automatic Speech Recognition Architecture | | 0
AudioPaLM: A Large Language Model That Can Speak and Listen | | 0
Recent Advances in Direct Speech-to-text Translation | | 0
Open Brain AI. Automatic Language Assessment | | 0
Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding | | 0
Towards End-to-end Speech-to-text Summarization | Code | 0
Improved Cross-Lingual Transfer Learning For Automatic Speech Translation | | 0
Strategies for improving low resource speech to text translation relying on pre-trained ASR models | | 0
STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions | | 0
CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training | | 0
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation | | 0

No leaderboard results yet.