SOTAVerified

Automatic Speech Recognition

Papers

Showing 51-75 of 3174 papers

Title | Status | Hype
NusaCrowd: Open Source Initiative for Indonesian NLP Resources | Code | 2
BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric | Code | 2
Towards A Unified Conformer Structure: from ASR to ASV Task | Code | 2
CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement | Code | 2
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning | Code | 2
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition | Code | 2
4-bit Conformer with Native Quantization Aware Training for Speech Recognition | Code | 2
CMGAN: Conformer-based Metric GAN for Speech Enhancement | Code | 2
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction | Code | 2
Robust Self-Supervised Audio-Visual Speech Recognition | Code | 2
Fast Transformers with Clustered Attention | Code | 2
Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities | Code | 1
From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition | Code | 1
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages | Code | 1
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities | Code | 1
VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverberation and Blind RIR Identification | Code | 1
Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models | Code | 1
Sagalee: an Open Source Automatic Speech Recognition Dataset for Oromo Language | Code | 1
FlanEC: Exploring Flan-T5 for Post-ASR Error Correction | Code | 1
Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation | Code | 1
MathSpeech: Leveraging Small LMs for Accurate Conversion in Mathematical Speech-to-Formula | Code | 1
XLSR-Mamba: A Dual-Column Bidirectional State Space Model for Spoofing Attack Detection | Code | 1
Enhancing Multimodal Sentiment Analysis for Missing Modality through Self-Distillation and Unified Modality Cross-Attention | Code | 1
VHASR: A Multimodal Speech Recognition System With Vision Hotwords | Code | 1
Mamba for Streaming ASR Combined with Unimodal Aggregation | Code | 1

No leaderboard results yet.