SOTAVerified

Automatic Speech Recognition

Papers

Showing 1601-1650 of 3174 papers

Title | Status | Hype
An Analysis of Semantically-Aligned Speech-Text Embeddings | - | 0
End-to-end model for named entity recognition from speech without paired training data | - | 0
Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation | - | 0
PriMock57: A Dataset Of Primary Care Mock Consultations | Code | 1
Alternate Intermediate Conditioning with Syllable-level and Character-level Targets for Japanese ASR | - | 0
End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation | - | 0
End-to-End Multi-speaker ASR with Independent Vector Analysis | - | 0
Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding | - | 0
Text-To-Speech Data Augmentation for Low Resource Speech Recognition | - | 0
Zero-Shot Cross-lingual Aphasia Detection using Automatic Speech Recognition | - | 0
Probing Speech Emotion Recognition Transformers for Linguistic Knowledge | - | 0
Importance of Different Temporal Modulations of Speech: A Tale of Two Perspectives | - | 0
indic-punct: An automatic punctuation restoration and inverse text normalization framework for Indic languages | Code | 1
How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications | Code | 1
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset | - | 0
Memory-Efficient Training of RNN-Transducer with Sampled Softmax | - | 0
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings | - | 0
Effectiveness of text to speech pseudo labels for forced alignment and cross lingual pretrained models for low resource speech recognition | - | 0
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data | - | 0
Analyzing the factors affecting usefulness of Self-Supervised Pre-trained Representations for Speech Recognition | - | 0
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction | Code | 1
Improving Speech Recognition for Indic Languages using Language Model | - | 0
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings | Code | 1
Code Switched and Code Mixed Speech Recognition for Indic languages | - | 0
Using Adapters to Overcome Catastrophic Forgetting in End-to-End Automatic Speech Recognition | Code | 0
Is Word Error Rate a good evaluation metric for Speech Recognition in Indic Languages? | - | 0
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion | Code | 1
Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization | Code | 0
4-bit Conformer with Native Quantization Aware Training for Speech Recognition | Code | 2
Dynamic Latency for CTC-Based Streaming Automatic Speech Recognition With Emformer | - | 0
Analysis of EEG frequency bands for Envisioned Speech Recognition | Code | 0
Earnings-22: A Practical Benchmark for Accents in the Wild | Code | 1
Integrating Lattice-Free MMI into End-to-End Speech Recognition | Code | 1
Frequency-Directional Attention Model for Multilingual Automatic Speech Recognition | - | 0
Mel Frequency Spectral Domain Defenses against Adversarial Attacks on Speech Recognition Systems | - | 0
Short-Term Word-Learning in a Dynamically Changing Environment | - | 0
Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR | Code | 1
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Code | 1
Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing | - | 0
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition | Code | 1
CMGAN: Conformer-based Metric GAN for Speech Enhancement | Code | 2
Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition | Code | 1
Finnish Parliament ASR corpus - Analysis, benchmarks and statistics | Code | 0
A Dataset for Speech Emotion Recognition in Greek Theatrical Plays | Code | 0
Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation for Automatic Speech Recognition | Code | 1
A Speech Representation Anonymization Framework via Selective Noise Perturbation | Code | 0
Speech-enhanced and Noise-aware Networks for Robust Speech Recognition | Code | 0
Impact of Dataset on Acoustic Models for Automatic Speech Recognition | - | 0
Computing Optimal Location of Microphone for Improved Speech Recognition | - | 0
Disentangleing Content and Fine-grained Prosody Information via Hybrid ASR Bottleneck Features for Voice Conversion | - | 0

No leaderboard results yet.