SOTAVerified

Automatic Speech Recognition

Papers

Showing 2551-2600 of 3174 papers

| Title | Status | Hype |
| --- | --- | --- |
| Zero Shot Text to Speech Augmentation for Automatic Speech Recognition on Low-Resource Accented Speech Corpora | | 0 |
| Zipformer: A faster and better encoder for automatic speech recognition | | 0 |
| Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities | | 0 |
| 100,000 Podcasts: A Spoken English Document Corpus | | 0 |
| ZJU’s IWSLT 2021 Speech Translation System | | 0 |
| Analysis of Deep Clustering as Preprocessing for Automatic Speech Recognition of Sparsely Overlapping Speech | | 0 |
| Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain | | 0 |
| Regularization Advantages of Multilingual Neural Language Models for Low Resource Domains | | 0 |
| Towards Better Understanding of Spontaneous Conversations: Overcoming Automatic Speech Recognition Errors With Intent Recognition | | 0 |
| Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR | | 0 |
| Transformer-based Cascaded Multimodal Speech Translation | | 0 |
| Improving sequence-to-sequence speech recognition training with on-the-fly data augmentation | | 0 |
| 1SPU: 1-step Speech Processing Unit | | 0 |
| Improving noisy student training for low-resource languages in End-to-End ASR using CycleGAN and inter-domain losses | | 0 |
| Towards interfacing large language models with ASR systems using confidence measures and prompting | | 0 |
| On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition | | 0 |
| Handling Numeric Expressions in Automatic Speech Recognition | | 0 |
| Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation | | 0 |
| SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data | | 0 |
| Self-Supervised Learning for Multi-Channel Neural Transducer | | 0 |
| ASR-enhanced Multimodal Representation Learning for Cross-Domain Product Retrieval | | 0 |
| MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder | | 0 |
| Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio | | 0 |
| LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors | | 0 |
| 2-bit Conformer quantization for automatic speech recognition | | 0 |
| 3-D Feature and Acoustic Modeling for Far-Field Speech Recognition | | 0 |
| 4-bit Quantization of LSTM-based Speech Recognition Models | | 0 |
| Joint Beam Search Integrating CTC, Attention, and Transducer Decoders | | 0 |
| 4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders | | 0 |
| A bandit approach to curriculum generation for automatic speech recognition | | 0 |
| A baseline model for computationally inexpensive speech recognition for Kazakh using the Coqui STT framework | | 0 |
| A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition | | 0 |
| A Benchmark of French ASR Systems Based on Error Severity | | 0 |
| A Case Study on Combining ASR and Visual Features for Generating Instructional Video Captions | | 0 |
| Accelerating Transducers through Adjacent Token Merging | | 0 |
| AccentDB: A Database of Non-Native English Accents to Assist Neural Speech Recognition | | 0 |
| Accented Speech Recognition: A Survey | | 0 |
| Accented Speech Recognition Inspired by Human Perception | | 0 |
| AccentFold: A Journey through African Accents for Zero-Shot ASR Adaptation to Target Accents | | 0 |
| Accent Recognition with Hybrid Phonetic Features | | 0 |
| Accent-Robust Automatic Speech Recognition Using Supervised and Unsupervised Wav2vec Embeddings | | 0 |
| Accurate and Structured Pruning for Efficient Automatic Speech Recognition | | 0 |
| Accurate synthesis of Dysarthric Speech for ASR data augmentation | | 0 |
| A CLARIN Transcription Portal for Interview Data | | 0 |
| A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection | | 0 |
| A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement | | 0 |
| AC-Mix: Self-Supervised Adaptation for Low-Resource Automatic Speech Recognition using Agnostic Contrastive Mixup | | 0 |
| A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives | | 0 |
| A Comparative Analysis of Crowdsourced Natural Language Corpora for Spoken Dialog Systems | | 0 |
| A Comparative Study of LLM-based ASR and Whisper in Low Resource and Code Switching Scenario | | 0 |

No leaderboard results yet.