SOTAVerified

Automatic Speech Recognition

Papers

Showing 451-500 of 3174 papers

Title | Status | Hype
Direct Speech to Speech Translation: A Review | - | 0
Unveiling Biases while Embracing Sustainability: Assessing the Dual Challenges of Automatic Speech Recognition Systems | - | 0
Adapting Automatic Speech Recognition for Accented Air Traffic Control Communications | - | 0
Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | - | 0
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition | - | 0
Exploring Gender Disparities in Automatic Speech Recognition Technology | - | 0
Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM | - | 0
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation | - | 0
Understanding Zero-shot Rare Word Recognition Improvements Through LLM Integration | - | 0
Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders | - | 0
The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages | - | 0
WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models | - | 0
Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks | Code | 0
Adopting Whisper for Confidence Estimation | - | 0
Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization | - | 0
Benchmarking Automatic Speech Recognition coupled LLM Modules for Medical Diagnostics | - | 0
Gesture-Aware Zero-Shot Speech Recognition for Patients with Language Disorders | - | 0
Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models | - | 0
MTLM: Incorporating Bidirectional Text Information to Enhance Language Model Training in Speech Recognition Systems | - | 0
Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge | - | 0
Causal Analysis of ASR Errors for Children: Quantifying the Impact of Physiological, Cognitive, and Extrinsic Factors | - | 0
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance | - | 0
Evaluating Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource Performance | - | 0
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers | - | 0
Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond | - | 0
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling | Code | 0
Gradient Norm-based Fine-Tuning for Backdoor Defense in Automatic Speech Recognition | - | 0
CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition | - | 0
A Differentiable Alignment Framework for Sequence-to-Sequence Modeling via Optimal Transport | - | 0
Data-Driven Mispronunciation Pattern Discovery for Robust Speech Recognition | - | 0
When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation | - | 0
Language Bias in Self-Supervised Learning For Automatic Speech Recognition | - | 0
SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions | - | 0
Cross-lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition | - | 0
Classification Error Bound for Low Bayes Error Conditions in Machine Learning | - | 0
SEAL: Speech Embedding Alignment Learning for Speech Large Language Model with Retrieval-Augmented Generation | - | 0
Speech Translation Refinement using Large Language Models | Code | 0
The Multicultural Medical Assistant: Can LLMs Improve Medical ASR Errors Across Borders? | - | 0
LoCoML: A Framework for Real-World ML Inference Pipelines | - | 0
Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing | - | 0
Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions | Code | 0
Investigation of Whisper ASR Hallucinations Induced by Non-Speech Audio | - | 0
Generative AI and Large Language Models in Language Preservation: Opportunities and Challenges | - | 0
GEC-RAG: Improving Generative Error Correction via Retrieval-Augmented Generation for Automatic Speech Recognition Systems | - | 0
A Benchmark of French ASR Systems Based on Error Severity | - | 0
Automatic Speech Recognition for Sanskrit with Transfer Learning | - | 0
Unsupervised Rhythm and Voice Conversion of Dysarthric to Healthy Speech for ASR | - | 0
Delayed Fusion: Integrating Large Language Models into First-Pass Decoding in End-to-end Speech Recognition | - | 0
PIER: A Novel Metric for Evaluating What Matters in Code-Switching | Code | 0
persoDA: Personalized Data Augmentation for Personalized ASR | - | 0

No leaderboard results yet.