Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference | Sep 26, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data | Sep 25, 2023 | Automatic Speech Recognition Speech Enhancement | - | Unverified | 0
Connecting Speech Encoder and Large Language Model for ASR | Sep 25, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-Supervised Speech Units | Sep 25, 2023 | Accented Speech Recognition Language Modeling | - | Unverified | 0
On the Impact of Quantization and Pruning of Self-Supervised Speech Models for Downstream Speech Recognition Tasks "In-the-Wild" | Sep 25, 2023 | Data Augmentation Model Compression | - | Unverified | 0
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data | Sep 25, 2023 | Speech Recognition Translation | Code | Code Available | 1
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning | Sep 25, 2023 | Representation Learning Self-Supervised Learning | Code | Code Available | 1
On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers | Sep 25, 2023 | Language Modeling Language Modelling | - | Unverified | 0
Human Transcription Quality Improvement | Sep 24, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code | Code Available | 0
Cross-modal Alignment with Optimal Transport for CTC-based ASR | Sep 24, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Speech enhancement with frequency domain auto-regressive modeling | Sep 24, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
My Science Tutor (MyST) -- A Large Corpus of Children's Conversational Speech | Sep 23, 2023 | Automatic Speech Recognition speech-recognition | - | Unverified | 0
NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization | Sep 22, 2023 | Automatic Speech Recognition speaker-diarization | - | Unverified | 0
Affect Recognition in Conversations Using Large Language Models | Sep 22, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Importance of Smoothness Induced by Optimizers in FL4ASR: Towards Understanding Federated Learning for End-to-End ASR | Sep 22, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Massive End-to-end Models for Short Search Queries | Sep 22, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model | Sep 22, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Big model only for hard audios: Sample dependent Whisper model selection for efficient inferences | Sep 22, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code | Code Available | 0
Memory-augmented conformer for improved end-to-end long-form ASR | Sep 22, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code | Code Available | 1
A Multiscale Autoencoder (MSAE) Framework for End-to-End Neural Network Speech Enhancement | Sep 21, 2023 | Automatic Speech Recognition Speech Enhancement | - | Unverified | 0
Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling | Sep 21, 2023 | Classification speech-recognition | - | Unverified | 0
CoMFLP: Correlation Measure based Fast Search on ASR Layer Pruning | Sep 21, 2023 | speech-recognition Speech Recognition | Code | Code Available | 0
Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech Recognition | Sep 21, 2023 | speech-recognition Speech Recognition | Code | Code Available | 1
Sparsely Shared LoRA on Whisper for Child Speech Recognition | Sep 21, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
AudioFool: Fast, Universal and synchronization-free Cross-Domain Attack on Speech Recognition | Sep 20, 2023 | Automatic Speech Recognition speech-recognition | - | Unverified | 0
Leveraging Data Collection and Unsupervised Learning for Code-switched Tunisian Arabic Automatic Speech Recognition | Sep 20, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Fine-Tuning Self-Supervised Learning Models for End-to-End Pronunciation Scoring | Sep 19, 2023 | Feature Engineering Phone-level pronunciation scoring | Code | Code Available | 1
Exploring Speech Enhancement for Low-resource Speech Synthesis | Sep 19, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement | Sep 19, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition | Sep 19, 2023 | Language Modeling Language Modelling | - | Unverified | 0
Semi-Autoregressive Streaming ASR With Label Context | Sep 19, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition | Sep 19, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
End-to-End Speech Recognition Contextualization with Large Language Models | Sep 19, 2023 | Decoder Language Modeling | - | Unverified | 0
HTEC: Human Transcription Error Correction | Sep 18, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Corpus Synthesis for Zero-shot ASR domain Adaptation using Large Language Models | Sep 18, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Training dynamic models using early exits for automatic speech recognition on resource-constrained devices | Sep 18, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code | Code Available | 0
Enhancing Multilingual Speech Recognition through Language Prompt Tuning and Frame-Level Language Adapter | Sep 18, 2023 | parameter-efficient fine-tuning speech-recognition | - | Unverified | 0
Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation | Sep 18, 2023 | Automatic Speech Recognition Knowledge Distillation | - | Unverified | 0
Instruction-Following Speech Recognition | Sep 18, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Investigating End-to-End ASR Architectures for Long Form Audio Transcription | Sep 18, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
HypR: A comprehensive study for ASR hypothesis revising with a reference corpus | Sep 18, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code | Code Available | 1
A Multitask Training Approach to Enhance Whisper with Contextual Biasing and Open-Vocabulary Keyword Spotting | Sep 18, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Continuous Modeling of the Denoising Process for Speech Enhancement Based on Deep Learning | Sep 17, 2023 | Automatic Speech Recognition Denoising | - | Unverified | 0
Enhancing Quantised End-to-End ASR Models via Personalisation | Sep 17, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code | Code Available | 0
Improving Speech Recognition for African American English With Audio Classification | Sep 16, 2023 | Audio Classification Automatic Speech Recognition | - | Unverified | 0
Decoder-only Architecture for Speech Recognition with CTC Prompts and Text Data Augmentation | Sep 16, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0
Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints | Sep 16, 2023 | Attribute Automatic Speech Recognition | - | Unverified | 0
Transformer Based Punctuation Restoration for Turkish | Sep 15, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code | Code Available | 0
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction | Sep 15, 2023 | Audio-Visual Speech Recognition speech-recognition | - | Unverified | 0
t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability | Sep 15, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | - | Unverified | 0