SOTAVerified

Speech-to-Text

Papers

Showing 101-150 of 403 papers

Title | Status | Hype
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking | - | 0
Robust Semantic Communications for Speech Transmission | - | 0
Brilla AI: AI Contestant for the National Science and Maths Quiz | Code | 1
Compact Speech Translation Models via Discrete Speech Units Pretraining | - | 0
Direct Punjabi to English speech translation using discrete units | - | 0
Hands-Free VR | - | 0
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | Code | 0
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing? | - | 0
Pushing the Limits of Zero-shot End-to-End Speech Translation | Code | 1
Syllable based DNN-HMM Cantonese Speech to Text System | - | 0
Careless Whisper: Speech-to-Text Hallucination Harms | Code | 0
Named Entity Recognition for Address Extraction in Speech-to-Text Transcriptions Using Synthetic Data | - | 0
Digits micro-model for accurate and secure transactions | - | 0
A Case Study on Filtering for End-to-End Speech Translation | - | 0
Streaming Sequence Transduction through Dynamic Compression | Code | 0
Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases | - | 0
Benchmarking Large Multimodal Models against Common Corruptions | Code | 1
Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks | - | 0
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild | Code | 0
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision | Code | 0
OAVA: the open audio-visual archives aggregator | - | 0
Revisiting the Entropy Semiring for Neural Speech Recognition | - | 0
Efficient Monotonic Multihead Attention | - | 0
End-to-End Speech-to-Text Translation: A Survey | - | 0
Multi-teacher Distillation for Multilingual Spelling Correction | - | 0
COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning | - | 0
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation | Code | 1
SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation | Code | 0
Toward Joint Language Modeling for Speech Units and Text | - | 0
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT | Code | 2
Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach | - | 0
Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer | - | 0
AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR | - | 0
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing | - | 0
Developing automatic verbatim transcripts for international multilingual meetings: an end-to-end solution | - | 0
Deepfake audio as a data augmentation technique for training automatic speech to text transcription models | - | 0
SpeechAlign: a Framework for Speech Translation Alignment Evaluation | - | 0
CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders | - | 0
PhantomSound: Black-Box, Query-Efficient Audio Adversarial Attack via Split-Second Phoneme Injection | - | 0
An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation | Code | 0
SONAR: Sentence-Level Multimodal and Language-Agnostic Representations | Code | 2
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation | Code | 2
Towards an AI to Win Ghana's National Science and Maths Quiz | Code | 1
Let's Give a Voice to Conversational Agents in Virtual Reality | Code | 0
N-gram Boosting: Improving Contextual Biasing with Normalized N-gram Targets | - | 0
Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN | Code | 0
A Change of Heart: Improving Speech Emotion Recognition through Speech-to-Text Modality Conversion | Code | 0
Improving RNN-Transducers with Acoustic LookAhead | - | 0
On decoder-only architecture for speech-to-text and large language model integration | - | 0
Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M | - | 0
Page 3 of 9

No leaderboard results yet.