Open Implementation and Study of BEST-RQ for Speech Processing | May 7, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition | May 6, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Whispy: Adapting STT Whisper Models to Real-Time Environments | May 6, 2024 | Action Detection, Activity Detection | Unverified | 0
Mixat: A Data Set of Bilingual Emirati-English Speech | May 4, 2024 | speech-recognition, Speech Recognition | Code Available | 0
Combining X-Vectors and Bayesian Batch Active Learning: Two-Stage Active Learning Pipeline for Speech Recognition | May 3, 2024 | Active Learning, Automatic Speech Recognition | Unverified | 0
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets | May 3, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1
Sequence-to-sequence models in peer-to-peer learning: A practical application | May 2, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Improving Membership Inference in ASR Model Auditing with Perturbed Loss Features | May 2, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Deep Learning Models in Speech Recognition: Measuring GPU Energy Consumption, Impact of Noise and Model Quantization for Edge Deployment | May 2, 2024 | GPU, NVIDIA Jetson Orin Nano | Code Available | 0
Efficient Compression of Multitask Multilingual Speech Models | May 2, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Low-resource speech recognition and dialect identification of Irish in a multi-task framework | May 2, 2024 | Decoder, Dialect Identification | Unverified | 0
Efficient Sample-Specific Encoder Perturbations | May 1, 2024 | Attribute, Decoder | Unverified | 0
Active Learning with Task Adaptation Pre-training for Speech Emotion Recognition | May 1, 2024 | Active Learning, Emotion Recognition | Code Available | 0
Does Whisper understand Swiss German? An automatic, qualitative, and human evaluation | Apr 30, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
A cost minimization approach to fix the vocabulary size in a tokenizer for an End-to-End ASR system | Apr 29, 2024 | speech-recognition, Speech Recognition | Unverified | 0
Towards Dog Bark Decoding: Leveraging Human Speech Processing for Automated Bark Classification | Apr 29, 2024 | Classification, Gender Classification | Unverified | 0
Child Speech Recognition in Human-Robot Interaction: Problem Solved? | Apr 26, 2024 | GPU, speech-recognition | Unverified | 0
Automatic Speech Recognition System-Independent Word Error Rate Estimation | Apr 25, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
U2++ MoE: Scaling 4.7x parameters with minimal impact on RTF | Apr 25, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Developing Acoustic Models for Automatic Speech Recognition in Swedish | Apr 25, 2024 | Automatic Speech Recognition, speech-recognition | Unverified | 0
Gated Low-rank Adaptation for personalized Code-Switching Automatic Speech Recognition on the low-spec devices | Apr 24, 2024 | Automatic Speech Recognition, CPU | Unverified | 0
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges | Apr 24, 2024 | Drug Design, Inductive Bias | Code Available | 2
Breaking Walls: Pioneering Automatic Speech Recognition for Central Kurdish: End-to-End Transformer Paradigm | Apr 23, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance | Apr 23, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Killkan: The Automatic Speech Recognition Dataset for Kichwa with Morphosyntactic Information | Apr 23, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
Less Peaky and More Accurate CTC Forced Alignment by Label Priors | Apr 22, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1
Exploring neural oscillations during speech perception via surrogate gradient spiking neural networks | Apr 22, 2024 | speech-recognition, Speech Recognition | Code Available | 0
Semantically Corrected Amharic Automatic Speech Recognition | Apr 20, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
Efficient infusion of self-supervised representations in Automatic Speech Recognition | Apr 19, 2024 | Automatic Speech Recognition, Decoder | Unverified | 0
Learn2Talk: 3D Talking Face Learns from 2D Talking Face | Apr 19, 2024 | Audio-Visual Speech Recognition, speech-recognition | Unverified | 0
Artificial Neural Networks to Recognize Speakers Division from Continuous Bengali Speech | Apr 18, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training | Apr 16, 2024 | Language Modeling, Language Modelling | Code Available | 0
Anatomy of Industrial Scale Multilingual ASR | Apr 15, 2024 | Anatomy, Automatic Speech Recognition | Unverified | 0
Resilience of Large Language Models for Noisy Instructions | Apr 15, 2024 | Automatic Speech Recognition, Optical Character Recognition | Unverified | 0
Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task | Apr 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Automatic Speech Recognition Advancements for Indigenous Languages of the Americas | Apr 12, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution | Apr 11, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Conformer-1: Robust ASR via Large-Scale Semisupervised Bootstrapping | Apr 10, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
An inclusive review on deep learning techniques and their scope in handwriting recognition | Apr 10, 2024 | Deep Learning, Handwriting Recognition | Unverified | 0
The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge | Apr 9, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
VietMed: A Dataset and Benchmark for Automatic Speech Recognition of Vietnamese in the Medical Domain | Apr 8, 2024 | Language Modelling, Speech Recognition | Unverified | 0
Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition | Apr 4, 2024 | Automatic Speech Recognition, Decoder | Unverified | 0
CMULAB: An Open-Source Framework for Training and Deployment of Natural Language Processing Models | Apr 3, 2024 | Optical Character Recognition (OCR), speech-recognition | Code Available | 1
Mai Ho'omāuna i ka 'Ai: Language Models Improve Automatic Speech Recognition in Hawaiian | Apr 3, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Kallaama: A Transcribed Speech Dataset about Agriculture in the Three Most Widely Spoken Languages in Senegal | Apr 2, 2024 | Automatic Speech Recognition, speech-recognition | Code Available | 1
Transfer Learning from Whisper for Microscopic Intelligibility Prediction | Apr 2, 2024 | Automatic Speech Recognition, Deep Learning | Unverified | 0
Noise Masking Attacks and Defenses for Pretrained Speech Models | Apr 2, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
BRAVEn: Improving Self-Supervised Pre-training for Visual and Auditory Speech Recognition | Apr 2, 2024 | speech-recognition, Speech Recognition | Code Available | 2
Houston we have a Divergence: A Subgroup Performance Analysis of ASR Models | Mar 31, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models | Mar 29, 2024 | Automatic Speech Recognition, speech-recognition | Code Available | 0