Transformer-Based Named Entity Recognition for Automated Server Provisioning | Apr 1, 2025 | named-entity-recognition, Named Entity Recognition | Code Available
Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems | Apr 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available
Scaling Auditory Cognition via Test-Time Compute in Audio Language Models | Mar 30, 2025 | speech-recognition, Speech Recognition | Unverified
The Impact of Code-switched Synthetic Data Quality is Task Dependent: Insights from MT and ASR | Mar 30, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
VALLR: Visual ASR Language Model for Lip Reading | Mar 27, 2025 | Automatic Speech Recognition, Language Modeling | Unverified
A 71.2-μW Speech Recognition Accelerator with Recurrent Spiking Neural Network | Mar 27, 2025 | Quantization, speech-recognition | Unverified
Efficient First-Order Optimization on the Pareto Set for Multi-Objective Learning under Preference Guidance | Mar 26, 2025 | Bilevel Optimization, Fairness | Unverified
FinAudio: A Benchmark for Audio Large Language Models in Financial Applications | Mar 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Improving Speech Recognition Accuracy Using Custom Language Models with the Vosk Toolkit | Mar 26, 2025 | speech-recognition, Speech Recognition | Unverified
Boosting the Transferability of Audio Adversarial Examples with Acoustic Representation Optimization | Mar 25, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Contextual Metric Meta-Evaluation by Measuring Local Metric Accuracy | Mar 25, 2025 | Benchmarking, speech-recognition | Unverified
Coverage-Guaranteed Speech Emotion Recognition via Calibrated Uncertainty-Adaptive Prediction Sets | Mar 24, 2025 | Conformal Prediction, Emotion Recognition | Unverified
Whispering in Amharic: Fine-tuning Whisper for Low-resource Language | Mar 24, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models | Mar 22, 2025 | Computational Efficiency, Mamba | Unverified
Your voice is your voice: Supporting Self-expression through Speech Generation and LLMs in Augmented and Alternative Communication | Mar 21, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors | Mar 20, 2025 | speaker-diarization, Speaker Diarization | Unverified
A Comprehensive Survey on Architectural Advances in Deep CNNs: Challenges, Applications, and Emerging Research Directions | Mar 19, 2025 | Action Recognition, Computational Efficiency | Unverified
Evaluating ASR Confidence Scores for Automated Error Detection in User-Assisted Correction Interfaces | Mar 19, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Halving transcription time: A fast, user-friendly and GDPR-compliant workflow to create AI-assisted transcripts for content analysis | Mar 17, 2025 | Automatic Speech Recognition, speech-recognition | Unverified
Enhancing Aviation Communication Transcription: Fine-Tuning Distil-Whisper with LoRA | Mar 13, 2025 | Automatic Speech Recognition, parameter-efficient fine-tuning | Unverified
ValSub: Subsampling Validation Data to Mitigate Forgetting during ASR Personalization | Mar 12, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Proceedings of the ISCA/ITG Workshop on Diversity in Large Speech and Language Models | Mar 12, 2025 | Diversity, General Knowledge | Unverified
Quantization for OpenAI's Whisper Models: A Comparative Analysis | Mar 12, 2025 | Quantization, speech-recognition | Code Available
Everything Can Be Described in Words: A Simple Unified Multi-Modal Framework with Semantic and Temporal Alignment | Mar 12, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Lend a Hand: Semi Training-Free Cued Speech Recognition via MLLM-Driven Hand Modeling for Barrier-free Communication | Mar 11, 2025 | Lip Reading, Prompt Engineering | Code Available
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR | Mar 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Automatic Speech Recognition for Non-Native English: Accuracy and Disfluency Handling | Mar 10, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Building English ASR model with regional language support | Mar 10, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs | Mar 9, 2025 | Audio-Visual Speech Recognition, Computational Efficiency | Unverified
A Causal Inference Approach for Quantifying Research Impact | Mar 7, 2025 | Causal Inference, counterfactual | Unverified
Self-Supervised Models for Phoneme Recognition: Applications in Children's Speech for Reading Learning | Mar 6, 2025 | Phoneme Recognition, Self-Supervised Learning | Unverified
From Voice to Safety: Language AI Powered Pilot-ATC Communication Understanding for Airport Surface Movement Collision Risk Assessment | Mar 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Qieemo: Speech Is All You Need in the Emotion Recognition in Conversations | Mar 5, 2025 | All, Automatic Speech Recognition | Unverified
CORDIC Is All You Need | Mar 4, 2025 | All, speech-recognition | Unverified
Fine-Tuning Whisper for Inclusive Prosodic Stress Analysis | Mar 3, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Direct Speech to Speech Translation: A Review | Mar 3, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation | Mar 2, 2025 | Decoder, Representation Learning | Unverified
Unveiling Biases while Embracing Sustainability: Assessing the Dual Challenges of Automatic Speech Recognition Systems | Mar 2, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Adapting Automatic Speech Recognition for Accented Air Traffic Control Communications | Feb 27, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition | Feb 26, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | Feb 26, 2025 | Audio Synthesis, Automatic Speech Recognition | Unverified
Exploring Gender Disparities in Automatic Speech Recognition Technology | Feb 25, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Improving the Inclusivity of Dutch Speech Recognition by Fine-tuning Whisper on the JASMIN-CGN Corpus | Feb 24, 2025 | Automatic Speech Recognition (ASR), speech-recognition | Code Available
Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM | Feb 24, 2025 | Automatic Speech Recognition, Language Modeling | Unverified
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation | Feb 24, 2025 | Automatic Speech Recognition, Diversity | Unverified
Understanding Zero-shot Rare Word Recognition Improvements Through LLM Integration | Feb 22, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages | Feb 21, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Retrieval-Augmented Speech Recognition Approach for Domain Challenges | Feb 21, 2025 | Decoder, RAG | Unverified
Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders | Feb 21, 2025 | Audio captioning, Automatic Speech Recognition | Unverified
WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models | Feb 20, 2025 | Automatic Speech Recognition, RAG | Unverified