SOTAVerified

Automatic Speech Recognition

Papers

Showing 201–250 of 3174 papers

Title | Status | Hype
Improved Noisy Student Training for Automatic Speech Recognition | Code | 1
Deep Sparse Conformer for Speech Recognition | Code | 1
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition | Code | 1
ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features | Code | 1
DiaCorrect: Error Correction Back-end For Speaker Diarization | Code | 1
MathSpeech: Leveraging Small LMs for Accurate Conversion in Mathematical Speech-to-Formula | Code | 1
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos | Code | 1
Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation | Code | 1
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder | Code | 1
Improving Self-supervised Pre-training using Accent-Specific Codebooks | Code | 1
Distilling a Pretrained Language Model to a Multilingual ASR Model | Code | 1
Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts | Code | 1
Adaptation of Whisper models to child speech recognition | Code | 1
Distilling Knowledge from Ensembles of Acoustic Models for Joint CTC-Attention End-to-End Speech Recognition | Code | 1
Integrating Lattice-Free MMI into End-to-End Speech Recognition | Code | 1
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond | Code | 1
Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation | Code | 1
Dompteur: Taming Audio Adversarial Examples | Code | 1
Adapting End-to-End Speech Recognition for Readable Subtitles | Code | 1
DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering | Code | 1
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities | Code | 1
Earnings-22: A Practical Benchmark for Accents in the Wild | Code | 1
A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement | Code | 1
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | Code | 1
ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications | Code | 1
Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients | Code | 1
A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English | Code | 1
End-to-end Named Entity Recognition from English Speech | Code | 1
How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications | Code | 1
Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition | Code | 1
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models | Code | 1
Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | Code | 1
End-to-End Speech Recognition and Disfluency Removal | Code | 1
End-to-End Speech Recognition from Federated Acoustic Models | Code | 1
Google Crowdsourced Speech Corpora and Related Open-Source Resources for Low-Resource Languages and Dialects: An Overview | Code | 1
Enhancing Monotonic Multihead Attention for Streaming ASR | Code | 1
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition | Code | 1
Punctuation Restoration using Transformer Models for High-and Low-Resource Languages | Code | 1
Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition | Code | 1
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion | Code | 1
ExKaldi-RT: A Real-Time Automatic Speech Recognition Extension Toolkit of Kaldi | Code | 1
Quilt-1M: One Million Image-Text Pairs for Histopathology | Code | 1
A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One | Code | 1
Factorized Neural Transducer for Efficient Language Model Adaptation | Code | 1
Fast Development of ASR in African Languages using Self Supervised Speech Representation Learning | Code | 1
Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement | Code | 1
ASR Error Correction with Constrained Decoding on Operation Prediction | Code | 1
How2: A Large-scale Dataset for Multimodal Language Understanding | Code | 1
HypR: A comprehensive study for ASR hypothesis revising with a reference corpus | Code | 1
ArTST: Arabic Text and Speech Transformer | Code | 1

No leaderboard results yet.