SOTAVerified

Automatic Speech Recognition

Papers

Showing 251-300 of 3174 papers

Title | Status | Hype
Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation | Code | 1
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages | Code | 1
indic-punct: An automatic punctuation restoration and inverse text normalization framework for Indic languages | Code | 1
Incorporating External POS Tagger for Punctuation Restoration | Code | 1
Automatic Speech Recognition for Speech Assessment of Persian Preschool Children | Code | 1
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers | Code | 1
Learning to Count Words in Fluent Speech enables Online Speech Recognition | Code | 1
KoSpeech: Open-Source Toolkit for End-to-End Korean Speech Recognition | Code | 1
Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation | Code | 1
K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables | Code | 1
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond | Code | 1
A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English | Code | 1
A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement | Code | 1
ASR Error Correction with Constrained Decoding on Operation Prediction | Code | 1
A Comparison of Methods for OOV-word Recognition on a New Public Dataset | Code | 1
ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications | Code | 1
Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification | Code | 1
A transfer learning based approach for pronunciation scoring | Code | 1
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition | Code | 1
Attention-based Contextual Language Model Adaptation for Speech Recognition | Code | 1
Attentive Sequence-to-Sequence Learning for Diacritic Restoration of Yorùbá Language Text | Code | 1
Audio-Visual Efficient Conformer for Robust Speech Recognition | Code | 1
An Investigation of End-to-End Models for Robust Speech Recognition | Code | 1
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Code | 1
LAE: Language-Aware Encoder for Monolingual and Multilingual ASR | Code | 1
A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One | Code | 1
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion | Code | 1
Automatic Severity Classification of Dysarthric speech by using Self-supervised Model with Multi-task Learning | Code | 1
JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT | Code | 1
Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet | Code | 1
A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition | Code | 1
AVATAR: Unconstrained Audiovisual Speech Recognition | Code | 1
ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs | Code | 1
Back Translation for Speech-to-text Translation Without Transcripts | Code | 1
BembaSpeech: A Speech Recognition Corpus for the Bemba Language | Code | 1
Layer-wise Analysis of a Self-supervised Speech Representation Model | Code | 1
ArTST: Arabic Text and Speech Transformer | Code | 1
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Code | 1
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition | Code | 1
Joint Masked CPC and CTC Training for ASR | Code | 1
Investigating the Reordering Capability in CTC-based Non-Autoregressive End-to-End Speech Translation | Code | 1
A Persian ASR-based SER: Modification of Sharif Emotional Speech Database and Investigation of Persian Text Corpora | Code | 1
A Reference-less Quality Metric for Automatic Speech Recognition via Contrastive-Learning of a Multi-Language Model with Self-Supervision | Code | 1
Brazilian Portuguese Speech Recognition Using Wav2vec 2.0 | Code | 1
Advancing Test-Time Adaptation in Wild Acoustic Test Settings | Code | 1
Can Contextual Biasing Remain Effective with Whisper and GPT-2? | Code | 1
Investigation of End-To-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings | Code | 1
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks | Code | 1
Dompteur: Taming Audio Adversarial Examples | Code | 1
ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus | Code | 1

No leaderboard results yet.