SOTAVerified

Automatic Speech Recognition

Papers

Showing 551–600 of 3174 papers

| Title | Status | Hype |
| --- | --- | --- |
| ED-CEC: Improving Rare Word Recognition Using ASR Postprocessing Based on Error Detection and Context-Aware Error Correction | Code | 0 |
| EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding | Code | 0 |
| Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge | Code | 0 |
| Advances in Small-Footprint Keyword Spotting: A Comprehensive Review of Efficient Models and Algorithms | Code | 0 |
| Blank Collapse: Compressing CTC emission for the faster decoding | Code | 0 |
| Does Joint Training Really Help Cascaded Speech Translation? | Code | 0 |
| Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks | Code | 0 |
| End-to-End Open Vocabulary Keyword Search With Multilingual Neural Representations | Code | 0 |
| Discrete Speech Unit Extraction via Independent Component Analysis | Code | 0 |
| Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition | Code | 0 |
| Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM | Code | 0 |
| Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation | Code | 0 |
| Direct Segmentation Models for Streaming Speech Translation | Code | 0 |
| Did you hear that? Adversarial Examples Against Automatic Speech Recognition | Code | 0 |
| BERT Attends the Conversation: Improving Low-Resource Conversational ASR | Code | 0 |
| Big model only for hard audios: Sample dependent Whisper model selection for efficient inferences | Code | 0 |
| Bigger is not Always Better: The Effect of Context Size on Speech Pre-Training | Code | 0 |
| Effects of Layer Freezing on Transferring a Speech Recognition System to Under-resourced Languages | Code | 0 |
| A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition | Code | 0 |
| DISCO: A Large Scale Human Annotated Corpus for Disfluency Correction in Indo-European Languages | Code | 0 |
| Bidirectional Quaternion Long-Short Term Memory Recurrent Neural Networks for Speech Recognition | Code | 0 |
| Bi-Directional Lattice Recurrent Neural Networks for Confidence Estimation | Code | 0 |
| DiaCorrect: End-to-end error correction for speaker diarization | Code | 0 |
| Detecting Adversarial Examples for Speech Recognition via Uncertainty Quantification | Code | 0 |
| Deep Learning for Audio Signal Processing | Code | 0 |
| Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition | Code | 0 |
| Detecting and Defending Against Adversarial Attacks on Automatic Speech Recognition via Diffusion Models | Code | 0 |
| DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation | Code | 0 |
| Beyond Levenshtein: Leveraging Multiple Algorithms for Robust Word Error Rate Computations And Granular Error Classifications | Code | 0 |
| Error-preserving Automatic Speech Recognition of Young English Learners' Language | Code | 0 |
| Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube | Code | 0 |
| Data augmentation using prosody and false starts to recognize non-native children's speech | Code | 0 |
| Coupled Training of Sequence-to-Sequence Models for Accented Speech Recognition | Code | 0 |
| Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching | Code | 0 |
| Data Fusion for Audiovisual Speaker Localization: Extending Dynamic Stream Weights to the Spatial Domain | Code | 0 |
| Explainability of Speech Recognition Transformers via Gradient-based Attention Visualization | Code | 0 |
| BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition | Code | 0 |
| Continual Learning for Monolingual End-to-End Automatic Speech Recognition | Code | 0 |
| Contrastive and Consistency Learning for Neural Noisy-Channel Model in Spoken Language Understanding | Code | 0 |
| Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords | Code | 0 |
| Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems | Code | 0 |
| Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio | Code | 0 |
| Comparison and Analysis of New Curriculum Criteria for End-to-End ASR | Code | 0 |
| Fine-Grained Grounding for Multimodal Speech Recognition | Code | 0 |
| Comparing Self-Supervised Learning Models Pre-Trained on Human Speech and Animal Vocalizations for Bioacoustics Processing | Code | 0 |
| BehancePR: A Punctuation Restoration Dataset for Livestreaming Video Transcript | Code | 0 |
| ADIMA: Abuse Detection In Multilingual Audio | Code | 0 |
| Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Code | 0 |
| Conditional independence for pretext task selection in Self-supervised speech representation learning | Code | 0 |
| Collecting Resources in Sub-Saharan African Languages for Automatic Speech Recognition: a Case Study of Wolof | Code | 0 |
Page 12 of 64

No leaderboard results yet.