SOTAVerified

Automatic Speech Recognition

Papers

Showing 151–175 of 3174 papers

Title | Status | Hype
Exploring Gender Disparities in Automatic Speech Recognition Technology | | 0
Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM | | 0
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation | | 0
Understanding Zero-shot Rare Word Recognition Improvements Through LLM Integration | | 0
The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages | | 0
Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders | | 0
WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models | | 0
Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks | Code | 0
Adopting Whisper for Confidence Estimation | | 0
Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models | | 0
Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization | | 0
Benchmarking Automatic Speech Recognition coupled LLM Modules for Medical Diagnostics | | 0
Gesture-Aware Zero-Shot Speech Recognition for Patients with Language Disorders | | 0
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities | Code | 1
MTLM: Incorporating Bidirectional Text Information to Enhance Language Model Training in Speech Recognition Systems | | 0
Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge | | 0
Causal Analysis of ASR Errors for Children: Quantifying the Impact of Physiological, Cognitive, and Extrinsic Factors | | 0
VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverberation and Blind RIR Identification | Code | 1
Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models | Code | 1
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance | | 0
Evaluating Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource Performance | | 0
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers | | 0
Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond | | 0
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling | Code | 0
CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition | | 0

No leaderboard results yet.