SOTAVerified

Speech-to-Text

Papers

Showing 301-350 of 403 papers

Title | Status | Hype
ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020 | | 0
Optimization of a Real-Time Wavelet-Based Algorithm for Improving Speech Intelligibility | | 0
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | | 0
PATCorrect: Non-autoregressive Phoneme-augmented Transformer for ASR Error Correction | | 0
Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling | | 0
AeGAN: Time-Frequency Speech Denoising via Generative Adversarial Networks | | 0
Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M | | 0
PhantomSound: Black-Box, Query-Efficient Audio Adversarial Attack via Split-Second Phoneme Injection | | 0
Phonemic Representation and Transcription for Speech to Text Applications for Under-resourced Indigenous African Languages: The Case of Kiswahili | | 0
Polish Read Speech Corpus for Speech Tools and Services | | 0
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison | | 0
Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases | | 0
Punctuation restoration in Swedish through fine-tuned KB-BERT | | 0
Pushing the performances of ASR models on English and Spanish accents | | 0
Recent Advances in Direct Speech-to-text Translation | | 0
Representation Purification for End-to-End Speech Translation | | 0
Revisiting End-to-End Speech-to-Text Translation From Scratch | | 0
Revisiting the Entropy Semiring for Neural Speech Recognition | | 0
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking | | 0
Robust Semantic Communications for Speech Transmission | | 0
Role of Intonation in Scoring Spoken English | | 0
RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks | | 0
S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamless Speech-Text Alignment and Streaming Speech Generation | | 0
SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation | | 0
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation | | 0
Self-Supervised Representations Improve End-to-End Speech Translation | | 0
Semantic-aware Speech to Text Transmission with Redundancy Removal | | 0
Semantic MIMO Systems for Speech-to-Text Transmission | | 0
Semantic-preserved Communication System for Highly Efficient Speech Transmission | | 0
Simple and Effective Unsupervised Speech Translation | | 0
SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation | | 0
SimulSpeech: End-to-End Simultaneous Speech to Text Translation | | 0
Speak2Label: Using Domain Knowledge for Creating a Large Scale Driver Gaze Zone Estimation Dataset | | 0
Speaker Independent Continuous Speech to Text Converter for Mobile Application | | 0
Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction | | 0
SpeechAlign: a Framework for Speech Translation Alignment Evaluation | | 0
Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody? | | 0
Speech Recognition Web Services for Dutch | | 0
Speech to Speech Translation with Translatotron: A State of the Art Review | | 0
Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation | | 0
Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding | | 0
Speech-to-Text and Evaluation of Multiple Machine Translation Systems | | 0
Speech to text and text to speech recognition systems-A review | | 0
Speech-to-Text Translation with Phoneme-Augmented CoT: Enhancing Cross-Lingual Transfer in Low-Resource Scenarios | | 0
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing? | | 0
SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation | | 0
SpiCE: A New Open-Access Corpus of Conversational Bilingual Speech in Cantonese and English | | 0
Strategies for improving low resource speech to text translation relying on pre-trained ASR models | | 0
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection | | 0
STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions | | 0

No leaderboard results yet.