SOTAVerified

Automatic Speech Recognition

Papers

Showing 551-600 of 3174 papers

Title | Status | Hype
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition | Code | 1
Joint Beam Search Integrating CTC, Attention, and Transducer Decoders | - | 0
Error-preserving Automatic Speech Recognition of Young English Learners' Language | Code | 0
Enhancing CTC-based speech recognition with diverse modeling units | - | 0
Text Injection for Neural Contextual Biasing | - | 0
Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition | - | 0
Keyword-Guided Adaptation of Automatic Speech Recognition | - | 0
Efficiently Train ASR Models that Memorize Less and Perform Better with Per-core Clipping | - | 0
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision | - | 0
Enabling ASR for Low-Resource Languages: A Comprehensive Dataset Creation Approach | - | 0
Wav2Prompt: End-to-End Speech Prompt Generation and Tuning For LLM in Zero and Few-shot Learning | - | 0
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities | - | 0
Intelligent Clinical Documentation: Harnessing Generative AI for Patient-Centric Clinical Note Generation | - | 0
A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition | Code | 1
Federating Dynamic Models using Early-Exit Architectures for Automatic Speech Recognition on Heterogeneous Clients | Code | 0
Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition | - | 0
Contrastive and Consistency Learning for Neural Noisy-Channel Model in Spoken Language Understanding | Code | 0
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models | Code | 3
Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition | Code | 2
Joint Optimization of Streaming and Non-Streaming Automatic Speech Recognition with Multi-Decoder and Knowledge Distillation | - | 0
Contextualized Automatic Speech Recognition with Dynamic Vocabulary | - | 0
You don't understand me!: Comparing ASR results for L1 and L2 speakers of Swedish | - | 0
FairLENS: Assessing Fairness in Law Enforcement Speech Recognition | - | 0
Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models | - | 0
Continued Pretraining for Domain Adaptation of Wav2vec2.0 in Automatic Speech Recognition for Elementary Math Classroom Settings | - | 0
Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer | - | 0
Sonos Voice Control Bias Assessment Dataset: A Methodology for Demographic Bias Assessment in Voice Assistants | - | 0
SpeechVerse: A Large-scale Generalizable Audio Language Model | - | 0
SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset | Code | 1
Lost in Transcription: Identifying and Quantifying the Accuracy Biases of Automatic Speech Recognition Systems Against Disfluent Speech | - | 0
Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models | Code | 1
Open Implementation and Study of BEST-RQ for Speech Processing | - | 0
MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition | - | 0
Combining X-Vectors and Bayesian Batch Active Learning: Two-Stage Active Learning Pipeline for Speech Recognition | - | 0
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets | Code | 1
Sequence-to-sequence models in peer-to-peer learning: A practical application | - | 0
Efficient Compression of Multitask Multilingual Speech Models | - | 0
Improving Membership Inference in ASR Model Auditing with Perturbed Loss Features | - | 0
Does Whisper understand Swiss German? An automatic, qualitative, and human evaluation | - | 0
Automatic Speech Recognition System-Independent Word Error Rate Estimation | - | 0
U2++ MoE: Scaling 4.7x parameters with minimal impact on RTF | - | 0
Developing Acoustic Models for Automatic Speech Recognition in Swedish | - | 0
Gated Low-rank Adaptation for personalized Code-Switching Automatic Speech Recognition on the low-spec devices | - | 0
Breaking Walls: Pioneering Automatic Speech Recognition for Central Kurdish: End-to-End Transformer Paradigm | - | 0
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance | - | 0
Killkan: The Automatic Speech Recognition Dataset for Kichwa with Morphosyntactic Information | Code | 0
Less Peaky and More Accurate CTC Forced Alignment by Label Priors | Code | 1
Semantically Corrected Amharic Automatic Speech Recognition | Code | 0
Efficient infusion of self-supervised representations in Automatic Speech Recognition | - | 0
Artificial Neural Networks to Recognize Speakers Division from Continuous Bengali Speech | - | 0
