SOTAVerified

Automatic Speech Recognition

Papers

Showing 751–800 of 3174 papers

| Title | Status | Hype |
| --- | --- | --- |
| Retrieve and Copy: Scaling ASR Personalization to Large Catalogs | | 0 |
| On the Effectiveness of ASR Representations in Real-world Noisy Speech Emotion Recognition | | 0 |
| Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation | Code | 1 |
| 1SPU: 1-step Speech Processing Unit | | 0 |
| A comparative analysis between Conformer-Transducer, Whisper, and wav2vec2 for improving the child speech recognition | Code | 0 |
| Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning | Code | 1 |
| Fine-tuning convergence model in Bengali speech recognition | | 0 |
| Pseudo-Labeling for Domain-Agnostic Bangla Automatic Speech Recognition | Code | 0 |
| COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning | | 0 |
| Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts | Code | 1 |
| Server-side Rescoring of Spoken Entity-centric Knowledge Queries for Virtual Assistants | | 0 |
| Automatic Disfluency Detection from Untranscribed Speech | Code | 1 |
| End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation | Code | 1 |
| RIR-SF: Room Impulse Response Based Spatial Feature for Target Speech Recognition in Multi-Channel Multi-Speaker Scenarios | | 0 |
| Combining Language Models For Specialized Domains: A Colorful Approach | | 0 |
| Developing a Multilingual Dataset and Evaluation Metrics for Code-Switching: A Focus on Hong Kong's Polylingual Dynamics | Code | 1 |
| Dialect Adaptation and Data Augmentation for Low-Resource ASR: TalTech Systems for the MADASR 2023 Challenge | | 0 |
| DISCO: A Large Scale Human Annotated Corpus for Disfluency Correction in Indo-European Languages | Code | 0 |
| CL-MASR: A Continual Learning Benchmark for Multilingual ASR | Code | 1 |
| ArTST: Arabic Text and Speech Transformer | Code | 1 |
| Accented Speech Recognition With Accent-specific Codebooks | Code | 1 |
| Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features | | 0 |
| Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation | | 0 |
| Quantifying the Dialect Gap and its Correlates Across Languages | | 0 |
| Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition | Code | 0 |
| Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation | | 0 |
| Intelligibility prediction with a pretrained noise-robust automatic speech recognition model | | 0 |
| SALMONN: Towards Generic Hearing Abilities for Large Language Models | Code | 3 |
| Unintended Memorization in Large ASR Models, and How to Mitigate It | | 0 |
| The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System | | 0 |
| Generative error correction for code-switching speech recognition using large language models | | 0 |
| Iterative Shallow Fusion of Backward Language Model for End-to-End Speech Recognition | | 0 |
| Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition | | 0 |
| Correction Focused Language Model Training for Speech Recognition | | 0 |
| Zipformer: A faster and better encoder for automatic speech recognition | | 0 |
| VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System | | 0 |
| Detecting Speech Abnormalities with a Perceiver-based Sequence Classifier that Leverages a Universal Speech Model | | 0 |
| Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization | | 0 |
| End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis | | 0 |
| Large Vocabulary Spontaneous Speech Recognition for Tigrigna | | 0 |
| Advancing Test-Time Adaptation in Wild Acoustic Test Settings | Code | 1 |
| Improved Contextual Recognition In Automatic Speech Recognition Systems By Semantic Lattice Rescoring | | 0 |
| SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation | | 0 |
| Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text | | 0 |
| On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition | | 0 |
| Adapting the adapters for code-switching in multilingual ASR | Code | 0 |
| Acoustic Model Fusion for End-to-end Speech Recognition | | 0 |
| Discriminative Speech Recognition Rescoring with Pre-trained Language Models | | 0 |
| No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation | | 0 |
| Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition | Code | 2 |
