SOTAVerified

Automatic Speech Recognition

Papers

Showing 151–200 of 3174 papers

| Title | Status | Hype |
| --- | --- | --- |
| Exploring Gender Disparities in Automatic Speech Recognition Technology | | 0 |
| Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM | | 0 |
| Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation | | 0 |
| Understanding Zero-shot Rare Word Recognition Improvements Through LLM Integration | | 0 |
| The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages | | 0 |
| Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders | | 0 |
| WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models | | 0 |
| Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks | Code | 0 |
| Adopting Whisper for Confidence Estimation | | 0 |
| Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models | | 0 |
| Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization | | 0 |
| Benchmarking Automatic Speech Recognition coupled LLM Modules for Medical Diagnostics | | 0 |
| Gesture-Aware Zero-Shot Speech Recognition for Patients with Language Disorders | | 0 |
| DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities | Code | 1 |
| Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge | | 0 |
| MTLM: Incorporating Bidirectional Text Information to Enhance Language Model Training in Speech Recognition Systems | | 0 |
| Causal Analysis of ASR Errors for Children: Quantifying the Impact of Physiological, Cognitive, and Extrinsic Factors | | 0 |
| VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverberation and Blind RIR Identification | Code | 1 |
| Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models | Code | 1 |
| Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance | | 0 |
| Evaluating Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource Performance | | 0 |
| Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers | | 0 |
| Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond | | 0 |
| Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling | Code | 0 |
| Gradient Norm-based Fine-Tuning for Backdoor Defense in Automatic Speech Recognition | | 0 |
| CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition | | 0 |
| A Differentiable Alignment Framework for Sequence-to-Sequence Modeling via Optimal Transport | | 0 |
| Data-Driven Mispronunciation Pattern Discovery for Robust Speech Recognition | | 0 |
| Sagalee: an Open Source Automatic Speech Recognition Dataset for Oromo Language | Code | 1 |
| When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation | | 0 |
| Language Bias in Self-Supervised Learning For Automatic Speech Recognition | | 0 |
| SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions | | 0 |
| Cross-lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition | | 0 |
| Classification Error Bound for Low Bayes Error Conditions in Machine Learning | | 0 |
| SEAL: Speech Embedding Alignment Learning for Speech Large Language Model with Retrieval-Augmented Generation | | 0 |
| The Multicultural Medical Assistant: Can LLMs Improve Medical ASR Errors Across Borders? | | 0 |
| Speech Translation Refinement using Large Language Models | Code | 0 |
| FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration | Code | 5 |
| LoCoML: A Framework for Real-World ML Inference Pipelines | | 0 |
| Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing | | 0 |
| FlanEC: Exploring Flan-T5 for Post-ASR Error Correction | Code | 1 |
| Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions | Code | 0 |
| Investigation of Whisper ASR Hallucinations Induced by Non-Speech Audio | | 0 |
| Generative AI and Large Language Models in Language Preservation: Opportunities and Challenges | | 0 |
| GEC-RAG: Improving Generative Error Correction via Retrieval-Augmented Generation for Automatic Speech Recognition Systems | | 0 |
| A Benchmark of French ASR Systems Based on Error Severity | | 0 |
| Unsupervised Rhythm and Voice Conversion of Dysarthric to Healthy Speech for ASR | | 0 |
| Automatic Speech Recognition for Sanskrit with Transfer Learning | | 0 |
| Delayed Fusion: Integrating Large Language Models into First-Pass Decoding in End-to-end Speech Recognition | | 0 |
| PIER: A Novel Metric for Evaluating What Matters in Code-Switching | Code | 0 |

No leaderboard results yet.