SOTAVerified

Automatic Speech Recognition

Papers

Showing 151–200 of 3174 papers

Title | Status | Hype
ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers | Code | 1
Common Voice: A Massively-Multilingual Speech Corpus | Code | 1
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech | Code | 1
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Code | 1
A context-aware knowledge transferring strategy for CTC-based ASR | Code | 1
Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition | Code | 1
A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications | Code | 1
Brazilian Portuguese Speech Recognition Using Wav2vec 2.0 | Code | 1
Can Contextual Biasing Remain Effective with Whisper and GPT-2? | Code | 1
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Code | 1
Advancing Test-Time Adaptation in Wild Acoustic Test Settings | Code | 1
Cross Attention Augmented Transducer Networks for Simultaneous Translation | Code | 1
Accented Speech Recognition With Accent-specific Codebooks | Code | 1
CTC-synchronous Training for Monotonic Attention Model | Code | 1
Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities | Code | 1
Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition | Code | 1
Can we use Common Voice to train a Multi-Speaker TTS system? | Code | 1
DiaCorrect: Error Correction Back-end For Speaker Diarization | Code | 1
BASPRO: a balanced script producer for speech corpus collection based on the genetic algorithm | Code | 1
Back Translation for Speech-to-text Translation Without Transcripts | Code | 1
BembaSpeech: A Speech Recognition Corpus for the Bemba Language | Code | 1
Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts | Code | 1
Dompteur: Taming Audio Adversarial Examples | Code | 1
Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation | Code | 1
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos | Code | 1
Efficient conformer: Progressive downsampling and grouped attention for automatic speech recognition | Code | 1
ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features | Code | 1
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction | Code | 1
End-to-end Named Entity Recognition from English Speech | Code | 1
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation | Code | 1
AV Taris: Online Audio-Visual Speech Recognition | Code | 1
AISHELL-NER: Named Entity Recognition from Chinese Speech | Code | 1
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition | Code | 1
Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | Code | 1
ExKaldi-RT: A Real-Time Automatic Speech Recognition Extension Toolkit of Kaldi | Code | 1
Extending Whisper with prompt tuning to target-speaker ASR | Code | 1
BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data | Code | 1
CB-Conformer: Contextual biasing Conformer for biased word recognition | Code | 1
Continuous speech separation: dataset and analysis | Code | 1
Radically Old Way of Computing Spectra: Applications in End-to-End ASR | Code | 1
Automatic Speech Recognition for Speech Assessment of Persian Preschool Children | Code | 1
Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition | Code | 1
How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications | Code | 1
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | Code | 1
Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning | Code | 1
Improved DeepFake Detection Using Whisper Features | Code | 1
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder | Code | 1
Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model | Code | 1
Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation | Code | 1
Automatic Speech Recognition Benchmark for Air-Traffic Communications | Code | 1
Page 4 of 64

No leaderboard results yet.