SOTAVerified

Automatic Speech Recognition

Papers

Showing 201–250 of 3174 papers

| Title | Status | Hype |
| --- | --- | --- |
| Adapting Whisper for Regional Dialects: Enhancing Public Services for Vulnerable Populations in the United Kingdom | | 0 |
| persoDA: Personalized Data Augmentation for Personalized ASR | | 0 |
| A Non-autoregressive Model for Joint STT and TTS | | 0 |
| Selective Attention Merging for low resource tasks: A case study of Child ASR | Code | 0 |
| Loudspeaker Beamforming to Enhance Speech Recognition Performance of Voice Driven Applications | | 0 |
| Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding | Code | 0 |
| AdaCS: Adaptive Normalization for Enhanced Code-Switching ASR | Code | 0 |
| Speech Recognition for Automatically Assessing Afrikaans and isiXhosa Preschool Oral Narratives | | 0 |
| Discrete Speech Unit Extraction via Independent Component Analysis | Code | 0 |
| A Survey on Spoken Italian Datasets and Corpora | | 0 |
| Contextual ASR Error Handling with LLMs Augmentation for Goal-Oriented Conversational AI | | 0 |
| Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Code | 0 |
| Benchmarking Rotary Position Embeddings for Automatic Speech Recognition | | 0 |
| Comparing Self-Supervised Learning Models Pre-Trained on Human Speech and Animal Vocalizations for Bioacoustics Processing | Code | 0 |
| Universal-2-TF: Robust All-Neural Text Formatting for ASR | | 0 |
| Universal Speaker Embedding Free Target Speaker Extraction and Personal Voice Activity Detection | | 0 |
| Deep Learning for Pathological Speech: A Survey | | 0 |
| Samba-ASR: State-Of-The-Art Speech Recognition Leveraging Structured State-Space Models | | 0 |
| Listening and Seeing Again: Generative Error Correction for Audio-Visual Speech Recognition | Code | 0 |
| Improving Transducer-Based Spoken Language Understanding with Self-Conditioned CTC and Knowledge Transfer | | 0 |
| Advancing Singlish Understanding: Bridging the Gap with Datasets and Multimodal Models | Code | 0 |
| LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale | | 0 |
| Breaking Through the Spike: Spike Window Decoding for Accelerated and Precise Automatic Speech Recognition | | 0 |
| Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation | Code | 1 |
| Automatic Text Pronunciation Correlation Generation and Application for Contextual Biasing | | 0 |
| Whisper Turns Stronger: Augmenting Wav2Vec 2.0 for Superior ASR in Low-Resource Languages | | 0 |
| Fotheidil: an Automatic Transcription System for the Irish Language | | 0 |
| DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition | Code | 2 |
| Enhancing Whisper's Accuracy and Speed for Indian Languages through Prompt-Tuning and Tokenization | | 0 |
| Enhancing Audiovisual Speech Recognition through Bifocal Preference Optimization | | 0 |
| Zero-resource Speech Translation and Recognition with LLMs | | 0 |
| UME: Upcycling Mixture-of-Experts for Scalable and Efficient Automatic Speech Recognition | | 0 |
| Enhancing Multilingual ASR for Unseen Languages via Language Embedding Modeling | | 0 |
| Transducer-Llama: Integrating LLMs into Streamable Transducer-based Speech Recognition | | 0 |
| Adapting Whisper for Code-Switching through Encoding Refining and Language-Aware Decoding | | 0 |
| Speech Retrieval-Augmented Generation without Automatic Speech Recognition | | 0 |
| TouchASP: Elastic Automatic Speech Perception that Everyone Can Touch | | 0 |
| MathSpeech: Leveraging Small LMs for Accurate Conversion in Mathematical Speech-to-Formula | Code | 1 |
| LAMA-UT: Language Agnostic Multilingual ASR through Orthography Unification and Language-Specific Transliteration | | 0 |
| Transcribing and Translating, Fast and Slow: Joint Speech Translation and Recognition | | 0 |
| Streaming Keyword Spotting Boosted by Cross-layer Discrimination Consistency | Code | 2 |
| Speak & Improve Challenge 2025: Tasks and Baseline Systems | | 0 |
| Speak & Improve Corpus 2025: an L2 English Speech Corpus for Language Assessment and Feedback | | 0 |
| Transliterated Zero-Shot Domain Adaptation for Automatic Speech Recognition | | 0 |
| Efficient Adaptation of Multilingual Models for Japanese ASR | Code | 0 |
| Greek2MathTex: A Greek Speech-to-Text Framework for LaTeX Equations Generation | Code | 0 |
| Bilevel Joint Unsupervised and Supervised Training for Automatic Speech Recognition | | 0 |
| Harnessing Transfer Learning from Swahili: Advancing Solutions for Comorian Dialects | | 0 |
| Leveraging Prompt Learning and Pause Encoding for Alzheimer's Disease Detection | | 0 |
| Effective Text Adaptation for LLM-based ASR through Soft Prompt Fine-Tuning | | 0 |
Page 5 of 64

No leaderboard results yet.