SOTAVerified

Automatic Speech Recognition

Papers

Showing 1451-1500 of 3174 papers

| Title | Status | Hype |
|---|---|---|
| H_eval: A new hybrid evaluation metric for automatic speech recognition tasks | | 0 |
| Phonetic-assisted Multi-Target Units Modeling for Improving Conformer-Transducer ASR system | | 0 |
| Streaming Audio-Visual Speech Recognition with Alignment Regularization | | 0 |
| Leveraging Domain Features for Detecting Adversarial Attacks Against Deep Speech Recognition in Noise | | 0 |
| Probing Statistical Representations For End-To-End ASR | | 0 |
| Monolingual Recognizers Fusion for Code-switching Speech Recognition | | 0 |
| InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss | | 0 |
| More Speaking or More Speakers? | | 0 |
| Towards Zero-Shot Code-Switched Speech Recognition | | 0 |
| BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder | | 0 |
| Mandarin-English Code-Switching Speech Recognition System for Specific Domain | | 0 |
| A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings | | 0 |
| Adapting self-supervised models to multi-talker speech recognition using speaker embeddings | | 0 |
| Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems | | 0 |
| A Preliminary Study on Automated Speaking Assessment of English as a Second Language (ESL) Students | | 0 |
| An analysis of degenerating speech due to progressive dysarthria on ASR performance | | 0 |
| Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation | | 0 |
| DiaCorrect: End-to-end error correction for speaker diarization | Code | 0 |
| Audio-Visual Speech Enhancement and Separation by Utilizing Multi-Modal Self-Supervised Embeddings | | 0 |
| FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition | | 0 |
| Blank Collapse: Compressing CTC emission for the faster decoding | Code | 0 |
| Structured State Space Decoder for Speech Recognition and Synthesis | | 0 |
| DuDe: Dual-Decoder Multilingual ASR for Indian Languages using Common Label Set | | 0 |
| Phonemic Representation and Transcription for Speech to Text Applications for Under-resourced Indigenous African Languages: The Case of Kiswahili | | 0 |
| Filter and evolve: progressive pseudo label refining for semi-supervised automatic speech recognition | | 0 |
| Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition | | 0 |
| Streaming Voice Conversion Via Intermediate Bottleneck Features And Non-streaming Teacher Guidance | | 0 |
| On Out-of-Distribution Detection for Audio with Deep Nearest Neighbors | Code | 0 |
| Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition | | 0 |
| V-Cloak: Intelligibility-, Naturalness- & Timbre-Preserving Real-Time Voice Anonymization | | 0 |
| Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation | | 0 |
| Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech | | 0 |
| Contextual-Utterance Training for Automatic Speech Recognition | | 0 |
| TRScore: A Novel GPT-based Readability Scorer for ASR Segmentation and Punctuation model evaluation and selection | | 0 |
| Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition | | 0 |
| SAN: a robust end-to-end ASR model architecture | | 0 |
| Simulating realistic speech overlaps improves multi-talker ASR | | 0 |
| Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation | Code | 0 |
| Monotonic segmental attention for automatic speech recognition | | 0 |
| Smart Speech Segmentation using Acousto-Linguistic Features with look-ahead | | 0 |
| UFO2: A unified pre-training framework for online and offline speech recognition | | 0 |
| Reducing Language confusion for Code-switching Speech Recognition with Token-level Language Diarization | Code | 0 |
| Four-in-One: A Joint Approach to Inverse Text Normalization, Punctuation, Capitalization, and Disfluency for Automatic Speech Recognition | | 0 |
| End-to-End Speech to Intent Prediction to improve E-commerce Customer Support Voicebot in Hindi and English | | 0 |
| Efficient Utilization of Large Pre-Trained Models for Low Resource ASR | | 0 |
| Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition | | 0 |
| Does Joint Training Really Help Cascaded Speech Translation? | Code | 0 |
| Investigating self-supervised, weakly supervised and fully supervised training approaches for multi-domain automatic speech recognition: a study on Bangladeshi Bangla | | 0 |
| Time-Domain Speech Enhancement for Robust Automatic Speech Recognition | | 0 |
| Guided contrastive self-supervised pre-training for automatic speech recognition | | 0 |

No leaderboard results yet.