Speech ReaLLM -- Real-time Streaming Speech Recognition with Multimodal LLMs by Teaching the Flow of Time | Jun 13, 2024 | Decoder, speech-recognition | Unverified | 0
Multi-Modal Retrieval For Large Language Model Based Speech Recognition | Jun 13, 2024 | Automatic Speech Recognition, Language Modeling | Unverified | 0
AdaPTwin: Low-Cost Adaptive Compression of Product Twins in Transformers | Jun 13, 2024 | speech-recognition, Speech Recognition | Unverified | 0
Transcription-Free Fine-Tuning of Speech Separation Models for Noisy and Reverberant Multi-Speaker Automatic Speech Recognition | Jun 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks | Jun 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
The Second DISPLACE Challenge : DIarization of SPeaker and LAnguage in Conversational Environments | Jun 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Exploring Spoken Language Identification Strategies for Automatic Transcription of Multilingual Broadcast and Institutional Speech | Jun 13, 2024 | Language Identification, speaker-diarization | Unverified | 0
Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn't | Jun 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
Improving child speech recognition with augmented child-like speech | Jun 12, 2024 | speech-recognition, Speech Recognition | Unverified | 0
Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness | Jun 12, 2024 | Action Detection, Activity Detection | Unverified | 0
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets | Jun 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data | Jun 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Refining Self-Supervised Learnt Speech Representation using Brain Activations | Jun 12, 2024 | Automatic Speech Recognition, Speaker Verification | Unverified | 0
DualVC 3: Leveraging Language Model Generated Pseudo Context for End-to-end Low Latency Streaming Voice Conversion | Jun 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Dual-Pipeline with Low-Rank Adaptation for New Language Integration in Multilingual ASR | Jun 12, 2024 | Automatic Speech Recognition, Decoder | Unverified | 0
Towards Unsupervised Speech Recognition Without Pronunciation Models | Jun 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation | Jun 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
Transformer-based Model for ASR N-Best Rescoring and Rewriting | Jun 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
PolySpeech: Exploring Unified Multitask Speech Models for Competitiveness with Single-task Models | Jun 12, 2024 | Language Modeling, Language Modelling | Unverified | 0
Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques | Jun 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
PRoDeliberation: Parallel Robust Deliberation for End-to-End Spoken Language Understanding | Jun 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Neural Blind Source Separation and Diarization for Distant Speech Recognition | Jun 12, 2024 | blind source separation, Distant Speech Recognition | Unverified | 0
Tag and correct: high precision post-editing approach to correction of speech recognition errors | Jun 11, 2024 | Automatic Speech Recognition, speech-recognition | Unverified | 0
Spoken Language Corpora Augmentation with Domain-Specific Voice-Cloned Speech | Jun 11, 2024 | speech-recognition, Speech Recognition | Unverified | 0
Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter | Jun 11, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection | Jun 11, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Reading Miscue Detection in Primary School through Automatic Speech Recognition | Jun 11, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
ASTRA: Aligning Speech and Text Representations for Asr without Sampling | Jun 10, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Synthetic Query Generation using Large Language Models for Virtual Assistants | Jun 10, 2024 | Information Retrieval, speech-recognition | Unverified | 0
Prompting Large Language Models with Audio for General-Purpose Speech Summarization | Jun 10, 2024 | speech-recognition, Speech Recognition | Code Available | 1
A Parameter-efficient Language Extension Framework for Multilingual ASR | Jun 10, 2024 | Continual Learning, parameter-efficient fine-tuning | Unverified | 0
Label-Looping: Highly Efficient Decoding for Transducers | Jun 10, 2024 | GPU, speech-recognition | Unverified | 0
MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations | Jun 9, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper | Jun 9, 2024 | speech-recognition, Speech Recognition | Code Available | 0
Optimizing Multi-Stuttered Speech Classification: Leveraging Whisper's Encoder for Efficient Parameter Reduction in Automated Assessment | Jun 9, 2024 | Multi-Label Classification, MUlTI-LABEL-ClASSIFICATION | Unverified | 0
LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR | Jun 7, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
LLM-based speaker diarization correction: A generalizable approach | Jun 7, 2024 | speaker-diarization, Speaker Diarization | Code Available | 1
Pitch-Aware RNN-T for Mandarin Chinese Mispronunciation Detection and Diagnosis | Jun 7, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Flexible Multichannel Speech Enhancement for Noise-Robust Frontend | Jun 6, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Label-Synchronous Neural Transducer for E2E Simultaneous Speech Translation | Jun 6, 2024 | es-en, speech-recognition | Code Available | 1
To Distill or Not to Distill? On the Robustness of Robust Knowledge Distillation | Jun 6, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
Hypernetworks for Personalizing ASR to Atypical Speech | Jun 6, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
BLSP-Emo: Towards Empathetic Large Speech-Language Models | Jun 6, 2024 | Emotion Recognition, Instruction Following | Code Available | 2
Speed of Light Exact Greedy Decoding for RNN-T Speech Recognition Models on GPU | Jun 6, 2024 | GPU, speech-recognition | Unverified | 0
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement | Jun 6, 2024 | Diversity, Speech Enhancement | Code Available | 1
Helsinki Speech Challenge 2024 | Jun 6, 2024 | Speech Enhancement, speech-recognition | Unverified | 0
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition | Jun 6, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1
Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual Datastores | Jun 6, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Joint Beam Search Integrating CTC, Attention, and Transducer Decoders | Jun 5, 2024 | Automatic Speech Recognition, Decoder | Unverified | 0
Error-preserving Automatic Speech Recognition of Young English Learners' Language | Jun 5, 2024 | Automatic Speech Recognition, Language Modelling | Code Available | 0