SOTAVerified

Automatic Speech Recognition

Papers

Showing 1101–1150 of 3174 papers

Title | Status | Hype
Towards the Universal Defense for Query-Based Audio Adversarial Attacks | - | 0
OLISIA: a Cascade System for Spoken Dialogue State Tracking | Code | 0
Security and Privacy Problems in Voice Assistant Applications: A Survey | - | 0
CB-Conformer: Contextual biasing Conformer for biased word recognition | Code | 1
Political corpus creation through automatic speech recognition on EU debates | Code | 0
Multimodal Short Video Rumor Detection System Based on Contrastive Learning | - | 0
A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers | - | 0
Evaluation of Speaker Anonymization on Emotional Speech | - | 0
A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition | - | 0
Task-oriented Document-Grounded Dialog Systems by HLTPR@RWTH for DSTC9 and DSTC10 | - | 0
Speech Reconstruction from Silent Tongue and Lip Articulation By Pseudo Target Generation and Domain Adversarial Training | - | 0
Regularizing Contrastive Predictive Coding for Speech Applications | - | 0
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR | - | 0
Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data | - | 0
Self-Supervised Learning-Based Source Separation for Meeting Data | - | 0
Multilingual Word Error Rate Estimation: e-WER3 | - | 0
Improving the previous state-of-the-art Frisian ASR by fine-tuning XLS-R | - | 0
Dialog act guided contextual adapter for personalized speech recognition | - | 0
The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR | - | 0
PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers | - | 0
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR | - | 0
Joint unsupervised and supervised learning for context-aware language identification | - | 0
When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP | Code | 1
Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis | - | 0
Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels | Code | 2
Enhancing Unsupervised Speech Recognition with Diffusion GANs | - | 0
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition | - | 0
Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition | - | 0
Self-supervised Learning with Speech Modulation Dropout | - | 0
Transformers in Speech Processing: A Survey | - | 0
End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations | - | 0
Code-Switching Text Generation and Injection in Mandarin-English ASR | - | 0
Knowledge Distillation from Multiple Foundation Models for End-to-End Speech Recognition | - | 0
A Deep Learning System for Domain-specific Speech Recognition | - | 0
Visual Information Matters for ASR Error Correction | - | 0
DistillW2V2: A Small and Streaming Wav2vec 2.0 Based ASR Model | - | 0
Trustera: A Live Conversation Redaction System | - | 0
HYBRIDFORMER: improving SqueezeFormer with hybrid attention and NSR mechanism | Code | 0
Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences | Code | 0
Improving Accented Speech Recognition with Multi-Domain Training | - | 0
Improving the Intent Classification accuracy in Noisy Environment | - | 0
Fine-tuning Strategies for Faster Inference using Speech Self-Supervised Models: A Comparative Study | Code | 0
Transcription free filler word detection with Neural semi-CRFs | Code | 0
Stabilizing Transformer Training by Preventing Attention Entropy Collapse | Code | 2
MIXPGD: Hybrid Adversarial Training for Speech Recognition Systems | - | 0
Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings | - | 0
wav2vec and its current potential to Automatic Speech Recognition in German for the usage in Digital History: A comparative assessment of available ASR-technologies for the use in cultural heritage contexts | - | 0
End-to-End Speech Recognition: A Survey | - | 0
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages | - | 0
Leveraging Large Text Corpora for End-to-End Speech Summarization | - | 0

No leaderboard results yet.