SOTAVerified

Automatic Speech Recognition

Papers

Showing 976–1000 of 3174 papers

| Title | Status | Hype |
| --- | --- | --- |
| Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models | | 0 |
| Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition | | 0 |
| Bigger is not Always Better: The Effect of Context Size on Speech Pre-Training | Code | 0 |
| End-to-End Speech-to-Text Translation: A Survey | | 0 |
| End-to-end Joint Punctuated and Normalized ASR with a Limited Amount of Punctuated Training Data | | 0 |
| A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors | Code | 0 |
| Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR | | 0 |
| Soft Random Sampling: A Theoretical and Empirical Analysis | | 0 |
| LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild | Code | 0 |
| How does end-to-end speech recognition training impact speech enhancement artifacts? | | 0 |
| App for Resume-Based Job Matching with Speech Interviews and Grammar Analysis: A Review | | 0 |
| Label-Synchronous Neural Transducer for Adaptable Online E2E Speech Recognition | | 0 |
| ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding | | 0 |
| Multi-channel Conversational Speaker Separation via Neural Diarization | | 0 |
| Improving Large-scale Deep Biasing with Phoneme Features and Text-only Data in Streaming Transducer | | 0 |
| Retrieve and Copy: Scaling ASR Personalization to Large Catalogs | | 0 |
| On the Effectiveness of ASR Representations in Real-world Noisy Speech Emotion Recognition | | 0 |
| 1SPU: 1-step Speech Processing Unit | | 0 |
| A comparative analysis between Conformer-Transducer, Whisper, and wav2vec2 for improving the child speech recognition | Code | 0 |
| Fine-tuning convergence model in Bengali speech recognition | | 0 |
| Pseudo-Labeling for Domain-Agnostic Bangla Automatic Speech Recognition | Code | 0 |
| COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning | | 0 |
| Server-side Rescoring of Spoken Entity-centric Knowledge Queries for Virtual Assistants | | 0 |
| RIR-SF: Room Impulse Response Based Spatial Feature for Target Speech Recognition in Multi-Channel Multi-Speaker Scenarios | | 0 |
| Combining Language Models For Specialized Domains: A Colorful Approach | | 0 |
Page 40 of 127

No leaderboard results yet.