SOTAVerified

Automatic Speech Recognition

Papers

Showing 701-725 of 3174 papers

| Title | Status | Hype |
| --- | --- | --- |
| DPSNN: Spiking Neural Network for Low-Latency Streaming Speech Enhancement | | 0 |
| Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation | | 0 |
| Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance | | 0 |
| Enhancing Dialogue Speech Recognition with Robust Contextual Awareness via Noise Representation Learning | | 0 |
| VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing | | 0 |
| Improving Whisper's Recognition Performance for Under-Represented Language Kazakh Leveraging Unpaired Speech and Text | | 0 |
| HydraFormer: One Encoder For All Subsampling Rates | Code | 0 |
| Preserving spoken content in voice anonymisation with character-level vocoder conditioning | Code | 0 |
| MathBridge: A Large Corpus Dataset for Translating Spoken Mathematical Expressions into LaTeX Formulas for Improved Readability | | 0 |
| ASR-enhanced Multimodal Representation Learning for Cross-Domain Product Retrieval | | 0 |
| Self-Supervised Learning for Multi-Channel Neural Transducer | | 0 |
| StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion | | 0 |
| SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data | | 0 |
| Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation | | 0 |
| On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition | | 0 |
| Towards interfacing large language models with ASR systems using confidence measures and prompting | | 0 |
| Leveraging Self-Supervised Models for Automatic Whispered Speech Recognition | Code | 0 |
| Improving noisy student training for low-resource languages in End-to-End ASR using CycleGAN and inter-domain losses | | 0 |
| On the Effect of Purely Synthetic Training Data for Different Automatic Speech Recognition Architectures | | 0 |
| Improving Domain-Specific ASR with LLM-Generated Contextual Descriptions | | 0 |
| Scaling A Simple Approach to Zero-Shot Speech Recognition | | 0 |
| A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives | | 0 |
| The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization | | 0 |
| Quantifying the Role of Textual Predictability in Automatic Speech Recognition | | 0 |
| Trading Devil Final: Backdoor attack via Stock market and Bayesian Optimization | | 0 |

No leaderboard results yet.