Online Hybrid CTC/Attention End-to-End Automatic Speech Recognition Architecture | Jul 5, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Transcribing Educational Videos Using Whisper: A preliminary study on using AI for transcribing educational videos | Jul 4, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Knowledge-Aware Audio-Grounded Generative Slot Filling for Limited Annotated Data | Jul 4, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework | Jul 4, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Boosting Norwegian Automatic Speech Recognition | Jul 4, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Leveraging Cross-Lingual Transfer Learning in Spoken Named Entity Recognition Systems | Jul 3, 2023 | Cross-Lingual Transfer, named-entity-recognition | Code Available | 0
Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages | Jul 3, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Don't Stop Self-Supervision: Accent Adaptation of Speech Representations via Residual Adapters | Jul 2, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Conformer LLMs -- Convolution Augmented Large Language Models | Jul 2, 2023 | Automatic Speech Recognition, Language Modeling | Unverified | 0
Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion | Jul 1, 2023 | speech-recognition, Speech Recognition | Code Available | 1
Learning Delays in Spiking Neural Networks using Dilated Convolutions with Learnable Spacings | Jun 30, 2023 | Audio Classification, speech-recognition | Code Available | 1
Automatic Speech Recognition of Non-Native Child Speech for Language Learning Applications | Jun 29, 2023 | Automatic Speech Recognition, speech-recognition | Unverified | 0
LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT | Jun 29, 2023 | Automatic Lyrics Transcription, Language Modeling | Code Available | 1
Leveraging Cross-Utterance Context For ASR Decoding | Jun 29, 2023 | speech-recognition, Speech Recognition | Unverified | 0
Long-term Conversation Analysis: Exploring Utility and Privacy | Jun 28, 2023 | Action Detection, Activity Detection | Code Available | 0
Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition | Jun 28, 2023 | Decoder, Domain Adaptation | Unverified | 0
Accelerating Transducers through Adjacent Token Merging | Jun 28, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms | Jun 27, 2023 | Deep Learning, GPU | Unverified | 0
Confidence-based Ensembles of End-to-End Speech Recognition Models | Jun 27, 2023 | Language Identification, Model Selection | Unverified | 0
Scaling Laws for Discriminative Speech Recognition Rescoring Models | Jun 27, 2023 | speech-recognition, Speech Recognition | Unverified | 0
Reducing the gap between streaming and non-streaming Transducer-based ASR by adaptive two-stage knowledge distillation | Jun 27, 2023 | Knowledge Distillation, speech-recognition | Unverified | 0
Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition | Jun 27, 2023 | Domain Adaptation, speech-recognition | Unverified | 0
Large-scale unsupervised audio pre-training for video-to-speech synthesis | Jun 27, 2023 | speech-recognition, Speech Recognition | Unverified | 0
Factorised Speaker-environment Adaptive Training of Conformer Speech Recognition Systems | Jun 26, 2023 | Diversity, speech-recognition | Unverified | 0
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning | Jun 23, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios | Jun 23, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems | Jun 23, 2023 | speech-recognition, Speech Recognition | Unverified | 0
Meta-Gating Framework for Fast and Continuous Resource Optimization in Dynamic Wireless Environments | Jun 23, 2023 | image-classification, Image Classification | Unverified | 0
AudioPaLM: A Large Language Model That Can Speak and Listen | Jun 22, 2023 | Language Modeling, Language Modelling | Unverified | 0
Exploring the Role of Audio in Video Captioning | Jun 21, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Strategies in Transfer Learning for Low-Resource Speech Synthesis: Phone Mapping, Features Input, and Source Language Selection | Jun 21, 2023 | Automatic Speech Recognition, speech-recognition | Unverified | 0
Federated Self-Learning with Weak Supervision for Speech Recognition | Jun 21, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Learning When to Trust Which Teacher for Weakly Supervised ASR | Jun 21, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
A Reference-less Quality Metric for Automatic Speech Recognition via Contrastive-Learning of a Multi-Language Model with Self-Supervision | Jun 21, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1
NoRefER: a Referenceless Quality Metric for Automatic Speech Recognition via Semi-Supervised Language Model Fine-Tuning with Contrastive Learning | Jun 21, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1
Mixture Encoder for Joint Speech Separation and Recognition | Jun 21, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Quilt-1M: One Million Image-Text Pairs for Histopathology | Jun 20, 2023 | Automatic Speech Recognition, Cross-Modal Retrieval | Code Available | 1
HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation | Jun 20, 2023 | Cross-corpus, Sentence | Code Available | 0
Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition | Jun 20, 2023 | Accented Speech Recognition, speech-recognition | Unverified | 0
Rehearsal-Free Online Continual Learning for Automatic Speech Recognition | Jun 19, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion Probabilistic Model | Jun 18, 2023 | Data Augmentation, Decoder | Code Available | 1
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition | Jun 18, 2023 | Decoder, Domain Adaptation | Code Available | 0
Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition | Jun 18, 2023 | Audio-Visual Speech Recognition, speech-recognition | Code Available | 1
MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition | Jun 18, 2023 | Audio-Visual Speech Recognition, Representation Learning | Code Available | 1
STHG: Spatial-Temporal Heterogeneous Graph Learning for Advanced Audio-Visual Diarization | Jun 18, 2023 | All, Graph Learning | Code Available | 1
MobileASR: A resource-aware on-device learning framework for user voice personalization applications on mobile phones | Jun 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Distillation Strategies for Discriminative Speech Recognition Rescoring | Jun 15, 2023 | Language Modeling, Language Modelling | Unverified | 0
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation | Jun 15, 2023 | Automatic Speech Recognition, Clustering | Code Available | 1
Lexical Speaker Error Correction: Leveraging Language Models for Speaker Diarization Error Correction | Jun 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
EM-Network: Oracle Guided Self-distillation for Sequence Learning | Jun 14, 2023 | Decoder, Machine Translation | Unverified | 0