SOTAVerified

Automatic Speech Recognition

Papers

Showing 751-800 of 3174 papers

Title | Status | Hype
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations | - | 0
The USTC-NERCSLIP Systems for The ICMC-ASR Challenge | - | 0
Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition | - | 0
Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects | Code | 0
Tradition or Innovation: A Comparison of Modern ASR Methods for Forced Alignment | - | 0
Enhanced ASR Robustness to Packet Loss with a Front-End Adaptation Network | Code | 0
Applying LLMs for Rescoring N-best ASR Hypotheses of Casual Conversations: Effects of Domain Adaptation and Context Carry-over | - | 0
SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR | - | 0
Automatic Speech Recognition for Hindi | - | 0
Dynamic Data Pruning for Automatic Speech Recognition | - | 0
MSR-86K: An Evolving, Multilingual Corpus with 86,300 Hours of Transcribed Audio for Speech Recognition Research | - | 0
Sequential Editing for Lifelong Training of Speech Recognition Models | - | 0
FASA: a Flexible and Automatic Speech Aligner for Extracting High-quality Aligned Children Speech Data | Code | 0
Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024 | - | 0
Decoder-only Architecture for Streaming End-to-end Speech Recognition | - | 0
Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss | - | 0
Perception of Phonological Assimilation by Neural Speech Recognition Models | - | 0
PI-Whisper: Designing an Adaptive and Incremental Automatic Speech Recognition System for Edge Devices | - | 0
An Adapter-Based Unified Model for Multiple Spoken Language Processing Tasks | - | 0
Intelligent Interface: Enhancing Lecture Engagement with Didactic Activity Summaries | - | 0
ManWav: The First Manchu ASR Model | - | 0
Joint vs Sequential Speaker-Role Detection and Automatic Speech Recognition for Air-traffic Control | - | 0
Transcribe, Align and Segment: Creating speech datasets for low-resource languages | - | 0
Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model | - | 0
Unsupervised Online Continual Learning for Automatic Speech Recognition | Code | 0
Performant ASR Models for Medical Entities in Accented Speech | - | 0
Growing Trees on Sounds: Assessing Strategies for End-to-End Dependency Parsing of Speech | Code | 0
Automatic Speech Recognition for Biomedical Data in Bengali Language | - | 0
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving | - | 0
Large Language Models for Dysfluency Detection in Stuttered Speech | - | 0
Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition | - | 0
Optimized Speculative Sampling for GPU Hardware Accelerators | Code | 0
Learning Language Structures through Grounding | - | 0
Optimizing Byte-level Representation for End-to-end ASR | - | 0
Inclusive ASR for Disfluent Speech: Cascaded Large-Scale Self-Supervised Learning with Targeted Fine-Tuning and Data Augmentation | - | 0
ROAR: Reinforcing Original to Augmented Data Ratio Dynamics for Wav2Vec2.0 Based ASR | - | 0
An efficient text augmentation approach for contextualized Mandarin speech recognition | - | 0
Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn't | Code | 0
Transcription-Free Fine-Tuning of Speech Separation Models for Noisy and Reverberant Multi-Speaker Automatic Speech Recognition | - | 0
Multi-Modal Retrieval For Large Language Model Based Speech Recognition | - | 0
The Second DISPLACE Challenge : DIarization of SPeaker and LAnguage in Conversational Environments | - | 0
Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment | - | 0
LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks | Code | 0
DualVC 3: Leveraging Language Model Generated Pseudo Context for End-to-end Low Latency Streaming Voice Conversion | - | 0
Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data | - | 0
Dual-Pipeline with Low-Rank Adaptation for New Language Integration in Multilingual ASR | - | 0
Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques | Code | 0
Towards Unsupervised Speech Recognition Without Pronunciation Models | Code | 0
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation | Code | 0
Transformer-based Model for ASR N-Best Rescoring and Rewriting | - | 0
Page 16 of 64

No leaderboard results yet.