SOTAVerified

Automatic Speech Recognition

Papers

Showing 201–250 of 3174 papers

| Title | Status | Hype |
| --- | --- | --- |
| Integrating Lattice-Free MMI into End-to-End Speech Recognition | Code | 1 |
| Interactive Feature Fusion for End-to-End Noise-Robust Speech Recognition | Code | 1 |
| ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus | Code | 1 |
| It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition | Code | 1 |
| Joint Masked CPC and CTC Training for ASR | Code | 1 |
| Kallaama: A Transcribed Speech Dataset about Agriculture in the Three Most Widely Spoken Languages in Senegal | Code | 1 |
| BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data | Code | 1 |
| Language and Speech Technology for Central Kurdish Varieties | Code | 1 |
| ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features | Code | 1 |
| Back Translation for Speech-to-text Translation Without Transcripts | Code | 1 |
| Large-Scale Streaming End-to-End Speech Translation with Neural Transducers | Code | 1 |
| BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Code | 1 |
| Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models | Code | 1 |
| Learning Audio-Visual Dereverberation | Code | 1 |
| Automatic Speech Recognition in Sanskrit: A New Speech Corpus and Modelling Insights | Code | 1 |
| LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech | Code | 1 |
| Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet | Code | 1 |
| A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition | Code | 1 |
| Adapting End-to-End Speech Recognition for Readable Subtitles | Code | 1 |
| Lightweight Adapter Tuning for Multilingual Speech Translation | Code | 1 |
| Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation for Automatic Speech Recognition | Code | 1 |
| Automatic Speech Recognition Benchmark for Air-Traffic Communications | Code | 1 |
| Automatic Severity Classification of Dysarthric speech by using Self-supervised Model with Multi-task Learning | Code | 1 |
| Automatic Speech Recognition for Speech Assessment of Persian Preschool Children | Code | 1 |
| metaCAT: A Metadata-based Task-oriented Chatbot Annotation Tool | Code | 1 |
| Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR | Code | 1 |
| MM-ALT: A Multimodal Automatic Lyric Transcription System | Code | 1 |
| AVATAR: Unconstrained Audiovisual Speech Recognition | Code | 1 |
| Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models | Code | 1 |
| MT3: Multi-Task Multitrack Music Transcription | Code | 1 |
| Audio-Visual Efficient Conformer for Robust Speech Recognition | Code | 1 |
| Attention-based Contextual Language Model Adaptation for Speech Recognition | Code | 1 |
| Neural Predictor for Black-Box Adversarial Attacks on Speech Recognition | Code | 1 |
| Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM | Code | 1 |
| OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation | Code | 1 |
| On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition | Code | 1 |
| Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition | Code | 1 |
| Attentive Sequence-to-Sequence Learning for Diacritic Restoration of Yorùbá Language Text | Code | 1 |
| Automatic Disfluency Detection from Untranscribed Speech | Code | 1 |
| Pretraining Techniques for Sequence-to-Sequence Voice Conversion | Code | 1 |
| AVLnet: Learning Audio-Visual Language Representations from Instructional Videos | Code | 1 |
| Punctuation Restoration using Transformer Models for High-and Low-Resource Languages | Code | 1 |
| PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR | Code | 1 |
| Integer-only Zero-shot Quantization for Efficient Speech Recognition | Code | 1 |
| REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR | Code | 1 |
| A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond | Code | 1 |
| A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English | Code | 1 |
| When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP | Code | 1 |
| A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement | Code | 1 |
| ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion | Code | 1 |

No leaderboard results yet.