SOTAVerified

Automatic Speech Recognition

Papers

Showing 51-75 of 3174 papers

Title | Status | Hype
Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions | Code | 2
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition | Code | 2
4-bit Conformer with Native Quantization Aware Training for Speech Recognition | Code | 2
Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition | Code | 2
Fast Transformers with Clustered Attention | Code | 2
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension | Code | 2
FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec | Code | 2
DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition | Code | 2
Dialectal Coverage And Generalization in Arabic Speech Recognition | Code | 2
emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography | Code | 2
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages | Code | 2
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context | Code | 1
Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI | Code | 1
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech | Code | 1
Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition | Code | 1
Framework for Curating Speech Datasets and Evaluating ASR Systems: A Case Study for Polish | Code | 1
Continuous speech separation: dataset and analysis | Code | 1
Combining Frame-Synchronous and Label-Synchronous Systems for Speech Recognition | Code | 1
Common Voice: A Massively-Multilingual Speech Corpus | Code | 1
ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers | Code | 1
CL-MASR: A Continual Learning Benchmark for Multilingual ASR | Code | 1
Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition | Code | 1
Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models | Code | 1
Can Contextual Biasing Remain Effective with Whisper and GPT-2? | Code | 1
Can we use Common Voice to train a Multi-Speaker TTS system? | Code | 1

No leaderboard results yet.