SOTAVerified

Automatic Speech Recognition

Papers

Showing 1001-1050 of 3174 papers

| Title | Status | Hype |
| --- | --- | --- |
| SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization | Code | 1 |
| Explainability of Speech Recognition Transformers via Gradient-based Attention Visualization | Code | 0 |
| Streaming Speech-to-Confusion Network Speech Recognition | | 0 |
| Can Contextual Biasing Remain Effective with Whisper and GPT-2? | Code | 1 |
| Improved DeepFake Detection Using Whisper Features | Code | 1 |
| Improved Training for End-to-End Streaming Automatic Speech Recognition Model with Punctuation | | 0 |
| Audio-Visual Speech Enhancement with Score-Based Generative Models | | 0 |
| Some voices are too common: Building fair speech recognition systems using the Common Voice dataset | | 0 |
| Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts | | 0 |
| Encoder-decoder multimodal speaker change detection | | 0 |
| Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili | | 0 |
| Adaptation and Optimization of Automatic Speech Recognition (ASR) for the Maritime Domain in the Field of VHF Communication | | 0 |
| Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home | | 0 |
| AfriNames: Most ASR models "butcher" African Names | | 0 |
| SlothSpeech: Denial-of-service Attack Against Speech Recognition Models | Code | 0 |
| Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning | | 0 |
| Strategies for improving low resource speech to text translation relying on pre-trained ASR models | | 0 |
| Accurate and Structured Pruning for Efficient Automatic Speech Recognition | | 0 |
| VILAS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition | | 0 |
| Zero-Shot Automatic Pronunciation Assessment | | 0 |
| Towards Selection of Text-to-speech Data to Augment ASR Training | | 0 |
| Adapting Multi-Lingual ASR Models for Handling Multiple Talkers | | 0 |
| Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator | Code | 0 |
| STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions | | 0 |
| Building Accurate Low Latency ASR for Streaming Voice Search | | 0 |
| Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation | | 0 |
| A Hierarchical Context-aware Modeling Approach for Multi-aspect and Multi-granular Pronunciation Assessment | | 0 |
| Can We Trust Explainable AI Methods on ASR? An Evaluation on Phoneme Recognition | | 0 |
| Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target | | 0 |
| CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice | | 0 |
| DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction | | 0 |
| DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution | Code | 0 |
| 2-bit Conformer quantization for automatic speech recognition | | 0 |
| INTapt: Information-Theoretic Adversarial Prompt Tuning for Enhanced Non-Native Speech Recognition | | 0 |
| Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator | | 0 |
| Improving Scheduled Sampling for Neural Transducer-based ASR | | 0 |
| Mixture-of-Expert Conformer for Streaming Multilingual ASR | | 0 |
| Svarah: Evaluating English ASR Systems on Indian Accents | | 0 |
| ASR and Emotional Speech: A Word-Level Investigation of the Mutual Impact of Speech and Emotion Recognition | | 0 |
| InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition | | 0 |
| Textless Speech-to-Speech Translation With Limited Parallel Data | Code | 0 |
| Iteratively Improving Speech Recognition and Voice Conversion | | 0 |
| Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation | | 0 |
| Graph Meets LLM: A Novel Approach to Collaborative Filtering for Robust Conversational Understanding | | 0 |
| Evaluating OpenAI's Whisper ASR for Punctuation Prediction and Topic Modeling of life histories of the Museum of the Person | | 0 |
| On the Transferability of Whisper-based Representations for "In-the-Wild" Cross-Task Downstream Speech Applications | | 0 |
| BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR | | 0 |
| Personalized Predictive ASR for Latency Reduction in Voice Assistants | | 0 |
| Cross-lingual Knowledge Transfer and Iterative Pseudo-labeling for Low-Resource Speech Recognition with Transducers | | 0 |
| TranUSR: Phoneme-to-word Transcoder Based Unified Speech Representation Learning for Cross-lingual Speech Recognition | | 0 |
