SOTAVerified

Automatic Speech Recognition

Papers

Showing 951-1000 of 3174 papers

| Title | Status | Hype |
|---|---|---|
| Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework | | 0 |
| DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction | | 0 |
| Dissecting User-Perceived Latency of On-Device E2E Speech Recognition | | 0 |
| Activity focused Speech Recognition of Preschool Children in Early Childhood Classrooms | | 0 |
| Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation | | 0 |
| A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection | | 0 |
| Distilling the Knowledge of BERT for CTC-based ASR | | 0 |
| Convolutional Variational Autoencoders for Spectrogram Compression in Automatic Speech Recognition | | 0 |
| Convolutional Speech Recognition with Pitch and Voice Quality Features | | 0 |
| Convoifilter: A case study of doing cocktail party speech recognition | | 0 |
| Distributed Deep Learning Strategies For Automatic Speech Recognition | | 0 |
| Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition | | 0 |
| DNCASR: End-to-End Training for Speaker-Attributed ASR | | 0 |
| DNN-Based Multilingual Automatic Speech Recognition for Wolaytta using Oromo Speech | | 0 |
| DNN-Based Semantic Model for Rescoring N-best Speech Recognition List | | 0 |
| Do as I mean, not as I say: Sequence Loss Training for Spoken Language Understanding | | 0 |
| Conversational Speech Recognition Needs Data? Experiments with Austrian German | | 0 |
| Attention-based ASR with Lightweight and Dynamic Convolutions | | 0 |
| Does Single-channel Speech Enhancement Improve Keyword Spotting Accuracy? A Case Study | | 0 |
| Does Speech enhancement of publicly available data help build robust Speech Recognition Systems? | | 0 |
| Does Visual Self-Supervision Improve Learning of Speech Representations for Emotion Recognition? | | 0 |
| Does Whisper understand Swiss German? An automatic, qualitative, and human evaluation | | 0 |
| Alignment Restricted Streaming Recurrent Neural Network Transducer | | 0 |
| Domain Adaptation of low-resource Target-Domain models using well-trained ASR Conformer Models | | 0 |
| Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems | | 0 |
| Effective Cross-Utterance Language Modeling for Conversational Speech Recognition | | 0 |
| Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation | | 0 |
| Don't Be So Sure! Boosting ASR Decoding via Confidence Relaxation | | 0 |
| Don't Stop Self-Supervision: Accent Adaptation of Speech Representations via Residual Adapters | | 0 |
| Do We Still Need Automatic Speech Recognition for Spoken Language Understanding? | | 0 |
| Do You Listen with One or Two Microphones? A Unified ASR Model for Single and Multi-Channel Audio | | 0 |
| DPSNN: Spiking Neural Network for Low-Latency Streaming Speech Enhancement | | 0 |
| DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR | | 0 |
| Driving ROVER with Segment-based ASR Quality Estimation | | 0 |
| An Investigation of End-to-End Multichannel Speech Recognition for Reverberant and Mismatch Conditions | | 0 |
| Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition | | 0 |
| Conversational Speech Recognition By Learning Conversation-level Characteristics | | 0 |
| Attacks as Defenses: Designing Robust Audio CAPTCHAs Using Attacks on Automatic Speech Recognition Systems | | 0 |
| Dual Language Models for Code Switched Speech Recognition | | 0 |
| Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition | | 0 |
| A CTC Triggered Siamese Network with Spatial-Temporal Dropout for Speech Recognition | | 0 |
| Dual Script E2E framework for Multilingual and Code-Switching ASR | | 0 |
| Controllable Time-Delay Transformer for Real-Time Punctuation Prediction and Disfluency Detection | | 0 |
| Contribution à l'étude de la variabilité de la voix des personnes âgées en reconnaissance automatique de la parole (Contribution to the study of elderly people's voice variability in automatic speech recognition) [in French] | | 0 |
| A Transfer Learning Method for Speech Emotion Recognition from Automatic Speech Recognition | | 0 |
| DualVoice: Speech Interaction that Discriminates between Normal and Whispered Voice Input | | 0 |
| Contrastive Semi-supervised Learning for ASR | | 0 |
| Learning Video Representations using Contrastive Bidirectional Transformer | | 0 |
| Alignment-Free Training for Transducer-based Multi-Talker ASR | | 0 |
| A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR | | 0 |
Page 20 of 64

No leaderboard results yet.