SOTAVerified

Automatic Speech Recognition

Papers

Showing 851-900 of 3174 papers

Title | Status | Hype
Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition | - | 0
HTEC: Human Transcription Error Correction | - | 0
Corpus Synthesis for Zero-shot ASR domain Adaptation using Large Language Models | - | 0
Training dynamic models using early exits for automatic speech recognition on resource-constrained devices | Code | 0
HypR: A comprehensive study for ASR hypothesis revising with a reference corpus | Code | 1
Investigating End-to-End ASR Architectures for Long Form Audio Transcription | - | 0
Instruction-Following Speech Recognition | - | 0
Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation | - | 0
A Multitask Training Approach to Enhance Whisper with Contextual Biasing and Open-Vocabulary Keyword Spotting | - | 0
Continuous Modeling of the Denoising Process for Speech Enhancement Based on Deep Learning | - | 0
Enhancing Quantised End-to-End ASR Models via Personalisation | Code | 0
Improving Speech Recognition for African American English With Audio Classification | - | 0
Decoder-only Architecture for Speech Recognition with CTC Prompts and Text Data Augmentation | - | 0
Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints | - | 0
Transformer Based Punctuation Restoration for Turkish | Code | 0
t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability | - | 0
DiaCorrect: Error Correction Back-end For Speaker Diarization | Code | 1
Unimodal Aggregation for CTC-based Speech Recognition | Code | 1
Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network | - | 0
Combining TF-GridNet and Mixture Encoder for Continuous Speech Separation for Meeting Transcription | - | 0
Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks | - | 0
Hybrid Attention-based Encoder-decoder Model for Efficient Language Model Adaptation | - | 0
CPPF: A contextual and post-processing-free model for automatic speech recognition | - | 0
PromptASR for contextualized ASR with controllable style | Code | 2
EnCodecMAE: Leveraging neural codecs for universal audio representation learning | Code | 1
FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec | Code | 2
Enhancing Child Vocalization Classification with Phonetically-Tuned Embeddings for Assisting Autism Diagnosis | - | 0
Open-vocabulary Keyword-spotting with Adaptive Instance Normalization | - | 0
Can Whisper perform speech-based in-context learning? | - | 0
Improving Robustness of Neural Inverse Text Normalization via Data-Augmentation, Semi-Supervised Learning, and Post-Aligning Method | - | 0
Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults | - | 0
Hybrid ASR for Resource-Constrained Robots: HMM - Deep Learning Fusion | Code | 0
Leveraging Large Language Models for Exploiting ASR Uncertainty | - | 0
Perceptual and Task-Oriented Assessment of a Semantic Metric for ASR Evaluation | Code | 0
Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems | - | 0
LanSER: Language-Model Supported Speech Emotion Recognition | - | 0
TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models | - | 0
Bring the Noise: Introducing Noise Robustness to Pretrained Automatic Speech Recognition | - | 0
AVATAR: Robust Voice Search Engine Leveraging Autoregressive Document Retrieval and Contrastive Learning | - | 0
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation | - | 0
Contextual Biasing of Named-Entities with Large Language Models | - | 0
Learning Speech Representation From Contrastive Token-Acoustic Pretraining | - | 0
Knowledge Distillation from Non-streaming to Streaming ASR Encoder using Auxiliary Non-streaming Layer | - | 0
ASTER: Automatic Speech Recognition System Accessibility Testing for Stutterers | - | 0
Adapting Text-based Dialogue State Tracker for Spoken Dialogues | - | 0
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition | - | 0
Neural approaches to spoken content embedding | - | 0
Decoupled Structure for Improved Adaptability of End-to-End Models | - | 0
A Small and Fast BERT for Chinese Medical Punctuation Restoration | Code | 0
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation | Code | 2

No leaderboard results yet.