SOTAVerified

Automatic Speech Recognition

Papers

Showing 926–950 of 3174 papers

| Title | Status | Hype |
|---|---|---|
| Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks | | 0 |
| AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition | | 0 |
| On Speech Pre-emphasis as a Simple and Inexpensive Method to Boost Speech Enhancement | | 0 |
| Improving ASR Contextual Biasing with Guided Attention | | 0 |
| NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription | | 0 |
| Multi-Input Multi-Output Target-Speaker Voice Activity Detection For Unified, Flexible, and Robust Audio-Visual Speaker Diarization | | 0 |
| SeMaScore : a new evaluation metric for automatic speech recognition tasks | | 0 |
| Cascaded Cross-Modal Transformer for Audio-Textual Classification | Code | 0 |
| Promptformer: Prompted Conformer Transducer for ASR | | 0 |
| Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization | | 0 |
| Transcending Controlled Environments: Assessing the Transferability of ASR-Robust NLU Models to Real-World Applications | | 0 |
| XLS-R Deep Learning Model for Multilingual ASR on Low-Resource Languages: Indonesian, Javanese, and Sundanese | | 0 |
| UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction | | 0 |
| End to end Hindi to English speech conversion using Bark, mBART and a finetuned XLSR Wav2Vec2 | | 0 |
| Useful Blunders: Can Automated Speech Recognition Errors Improve Downstream Dementia Classification? | | 0 |
| Continuously Learning New Words in Automatic Speech Recognition | | 0 |
| Exploratory Evaluation of Speech Content Masking | | 0 |
| High-precision Voice Search Query Correction via Retrievable Speech-text Embedings | | 0 |
| LUPET: Incorporating Hierarchical Information Path into Multilingual ASR | | 0 |
| BS-PLCNet: Band-split Packet Loss Concealment Network with Multi-task Learning Framework and Multi-discriminators | | 0 |
| ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge | | 0 |
| MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition | | 0 |
| Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation | Code | 0 |
| TeLeS: Temporal Lexeme Similarity Score to Estimate Confidence in End-to-End ASR | Code | 0 |
| Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech Recognition | Code | 0 |
Page 38 of 127

No leaderboard results yet.