SOTAVerified

Automatic Speech Recognition

Papers

Showing 1301-1350 of 3174 papers

Title | Status | Hype
Political corpus creation through automatic speech recognition on EU debates | Code | 0
Multimodal Short Video Rumor Detection System Based on Contrastive Learning |  | 0
A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers |  | 0
A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition |  | 0
Evaluation of Speaker Anonymization on Emotional Speech |  | 0
Task-oriented Document-Grounded Dialog Systems by HLTPR@RWTH for DSTC9 and DSTC10 |  | 0
Speech Reconstruction from Silent Tongue and Lip Articulation By Pseudo Target Generation and Domain Adversarial Training |  | 0
Regularizing Contrastive Predictive Coding for Speech Applications |  | 0
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR |  | 0
Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data |  | 0
Self-Supervised Learning-Based Source Separation for Meeting Data |  | 0
Multilingual Word Error Rate Estimation: e-WER3 |  | 0
Dialog act guided contextual adapter for personalized speech recognition |  | 0
Improving the previous state-of-the-art Frisian ASR by fine-tuning XLS-R |  | 0
The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR |  | 0
PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers |  | 0
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR |  | 0
Joint unsupervised and supervised learning for context-aware language identification |  | 0
Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis |  | 0
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition |  | 0
Enhancing Unsupervised Speech Recognition with Diffusion GANs |  | 0
Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition |  | 0
Self-supervised Learning with Speech Modulation Dropout |  | 0
Transformers in Speech Processing: A Survey |  | 0
End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations |  | 0
Knowledge Distillation from Multiple Foundation Models for End-to-End Speech Recognition |  | 0
Code-Switching Text Generation and Injection in Mandarin-English ASR |  | 0
A Deep Learning System for Domain-specific Speech Recognition |  | 0
DistillW2V2: A Small and Streaming Wav2vec 2.0 Based ASR Model |  | 0
Visual Information Matters for ASR Error Correction |  | 0
Trustera: A Live Conversation Redaction System |  | 0
HYBRIDFORMER: improving SqueezeFormer with hybrid attention and NSR mechanism | Code | 0
Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences | Code | 0
Improving Accented Speech Recognition with Multi-Domain Training |  | 0
Fine-tuning Strategies for Faster Inference using Speech Self-Supervised Models: A Comparative Study | Code | 0
Improving the Intent Classification accuracy in Noisy Environment |  | 0
Transcription free filler word detection with Neural semi-CRFs | Code | 0
Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings |  | 0
MIXPGD: Hybrid Adversarial Training for Speech Recognition Systems |  | 0
wav2vec and its current potential to Automatic Speech Recognition in German for the usage in Digital History: A comparative assessment of available ASR-technologies for the use in cultural heritage contexts |  | 0
End-to-End Speech Recognition: A Survey |  | 0
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages |  | 0
Leveraging Large Text Corpora for End-to-End Speech Summarization |  | 0
Synthetic Cross-accent Data Augmentation for Automatic Speech Recognition |  | 0
N-best T5: Robust ASR Error Correction using Multiple Input Hypotheses and Constrained Decoding Space |  | 0
Leveraging Redundancy in Multiple Audio Signals for Far-Field Speech Recognition |  | 0
Practice of the conformer enhanced AUDIO-VISUAL HUBERT on Mandarin and English |  | 0
Language-Universal Adapter Learning with Knowledge Distillation for End-to-End Multilingual Speech Recognition | Code | 0
Diacritic Recognition Performance in Arabic ASR |  | 0
Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video |  | 0

No leaderboard results yet.