SOTAVerified

Automatic Speech Recognition

Papers

Showing 1251-1300 of 3174 papers

| Title | Status | Hype |
| --- | --- | --- |
| Unsupervised Model-based speaker adaptation of end-to-end lattice-free MMI model for speech recognition | | 0 |
| Hey ASR System! Why Aren't You More Inclusive? Automatic Speech Recognition Systems' Bias and Proposed Bias Mitigation Techniques. A Literature Review | | 0 |
| LongFNT: Long-form Speech Recognition with Factorized Neural Transducer | | 0 |
| Improving Speech Emotion Recognition with Unsupervised Speaking Style Transfer | | 0 |
| On using the UA-Speech and TORGO databases to validate automatic dysarthric speech classification approaches | | 0 |
| Introducing Semantics into Speech Encoders | | 0 |
| Towards A Unified Conformer Structure: from ASR to ASV Task | Code | 2 |
| MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets | Code | 1 |
| Handling Trade-Offs in Speech Separation with Sparsely-Gated Mixture of Experts | | 0 |
| Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation | | 0 |
| The Far Side of Failure: Investigating the Impact of Speech Recognition Errors on Subsequent Dementia Classification | Code | 0 |
| A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding | | 0 |
| Adaptive Multi-Corpora Language Model Training for Speech Recognition | | 0 |
| Improving Noisy Student Training on Non-target Domain Data for Automatic Speech Recognition | | 0 |
| Towards Improved Room Impulse Response Estimation for Speech Recognition | Code | 1 |
| ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications | Code | 1 |
| Robust Unstructured Knowledge Access in Conversational Dialogue with ASR Errors | Code | 0 |
| Streaming, fast and accurate on-device Inverse Text Normalization for Automatic Speech Recognition | | 0 |
| End-to-End Evaluation of a Spoken Dialogue System for Learning Basic Mathematics | | 0 |
| Bridging Speech and Textual Pre-trained Models with Unsupervised ASR | | 0 |
| LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers | | 0 |
| Evaluation of Automated Speech Recognition Systems for Conversational Speech: A Linguistic Perspective | | 0 |
| Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech | | 0 |
| Resource-Efficient Transfer Learning From Speech Foundation Model Using Hierarchical Feature Fusion | | 0 |
| Multi-blank Transducers for Speech Recognition | Code | 1 |
| Biased Self-supervised learning for ASR | | 0 |
| Probing Statistical Representations For End-To-End ASR | | 0 |
| H_eval: A new hybrid evaluation metric for automatic speech recognition tasks | | 0 |
| Streaming Audio-Visual Speech Recognition with Alignment Regularization | | 0 |
| Leveraging Domain Features for Detecting Adversarial Attacks Against Deep Speech Recognition in Noise | | 0 |
| Phonetic-assisted Multi-Target Units Modeling for Improving Conformer-Transducer ASR system | | 0 |
| Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing | Code | 1 |
| InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss | | 0 |
| Towards Zero-Shot Code-Switched Speech Recognition | | 0 |
| Monolingual Recognizers Fusion for Code-switching Speech Recognition | | 0 |
| More Speaking or More Speakers? | | 0 |
| BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder | | 0 |
| A Preliminary Study on Automated Speaking Assessment of English as a Second Language (ESL) Students | | 0 |
| Mandarin-English Code-Switching Speech Recognition System for Specific Domain | | 0 |
| Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems | | 0 |
| A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings | | 0 |
| Adapting self-supervised models to multi-talker speech recognition using speaker embeddings | | 0 |
| An analysis of degenerating speech due to progressive dysarthria on ASR performance | | 0 |
| DiaCorrect: End-to-end error correction for speaker diarization | Code | 0 |
| FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition | | 0 |
| Delay-penalized transducer for low-latency streaming ASR | Code | 3 |
| Audio-Visual Speech Enhancement and Separation by Utilizing Multi-Modal Self-Supervised Embeddings | | 0 |
| Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation | | 0 |
| Blank Collapse: Compressing CTC emission for the faster decoding | Code | 0 |
| Structured State Space Decoder for Speech Recognition and Synthesis | | 0 |

No leaderboard results yet.