Keyword spotting -- Detecting commands in speech using deep learning | Dec 9, 2023 | Deep Learning, Feature Engineering | Unverified | 0
A Review of Hybrid and Ensemble in Deep Learning for Natural Language Processing | Dec 9, 2023 | Deep Learning, Language Modeling | Unverified | 0
FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning | Dec 7, 2023 | Federated Learning, image-classification | Unverified | 0
Graph Convolutions Enrich the Self-Attention in Transformers! | Dec 7, 2023 | Clone Detection | Code Available | 1
Optimizing Two-Pass Cross-Lingual Transfer Learning: Phoneme Recognition and Phoneme to Grapheme Translation | Dec 6, 2023 | Cross-Lingual Transfer, Phoneme Recognition | Unverified | 0
Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models | Dec 6, 2023 | Automatic Speech Recognition, Decoder | Unverified | 0
Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition | Dec 6, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features | Dec 5, 2023 | cross-modal alignment, Decoder | Unverified | 0
Bigger is not Always Better: The Effect of Context Size on Speech Pre-Training | Dec 3, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
End-to-End Speech-to-Text Translation: A Survey | Dec 2, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Self Generated Wargame AI: Double Layer Agent Task Planning Based on Large Language Model | Dec 2, 2023 | Decision Making, Language Modeling | Unverified | 0
Mavericks at NADI 2023 Shared Task: Unravelling Regional Nuances through Dialect Identification using Transformer-based Approach | Nov 30, 2023 | Dialect Identification, Multi-class Classification | Unverified | 0
Speech Understanding on Tiny Devices with A Learning Cache | Nov 30, 2023 | speech-recognition, Speech Recognition | Code Available | 0
Adapting OpenAI's Whisper for Speech Recognition on Code-Switch Mandarin-English SEAME and ASRU2019 Datasets | Nov 29, 2023 | speech-recognition, Speech Recognition | Unverified | 0
End-to-end Joint Punctuated and Normalized ASR with a Limited Amount of Punctuated Training Data | Nov 29, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
D4AM: A General Denoising Framework for Downstream Acoustic Models | Nov 28, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors | Nov 27, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
Phonetic-aware speaker embedding for far-field speaker verification | Nov 27, 2023 | Speaker Recognition, Speaker Verification | Unverified | 0
Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching | Nov 25, 2023 | Language Modeling, Language Modelling | Unverified | 0
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR | Nov 24, 2023 | Automatic Speech Recognition, speech-recognition | Unverified | 0
Do VSR Models Generalize Beyond LRS3? | Nov 23, 2023 | Lip Reading, speech-recognition | Code Available | 1
Investigating Weight-Perturbed Deep Neural Networks With Application in Iris Presentation Attack Detection | Nov 21, 2023 | image-classification, Image Classification | Code Available | 0
Analysis of Visual Features for Continuous Lipreading in Spanish | Nov 21, 2023 | Lipreading, speech-recognition | Unverified | 0
Soft Random Sampling: A Theoretical and Empirical Analysis | Nov 21, 2023 | Automatic Speech Recognition, speech-recognition | Unverified | 0
Speaker-Adapted End-to-End Visual Speech Recognition for Continuous Spanish | Nov 21, 2023 | speech-recognition, Speech Recognition | Unverified | 0
LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild | Nov 21, 2023 | Automatic Speech Recognition, speech-recognition | Code Available | 0
App for Resume-Based Job Matching with Speech Interviews and Grammar Analysis: A Review | Nov 20, 2023 | Automatic Speech Recognition, speech-recognition | Unverified | 0
How does end-to-end speech recognition training impact speech enhancement artifacts? | Nov 20, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Beyond Boundaries: A Comprehensive Survey of Transferable Attacks on AI Systems | Nov 20, 2023 | Autonomous Driving, Autonomous Vehicles | Unverified | 0
Label-Synchronous Neural Transducer for Adaptable Online E2E Speech Recognition | Nov 19, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding | Nov 19, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System | Nov 17, 2023 | Decoder, Privacy Preserving | Unverified | 0
Investigating the Emergent Audio Classification Ability of ASR Foundation Models | Nov 15, 2023 | Audio Classification, Decoder | Code Available | 0
Improving Large-scale Deep Biasing with Phoneme Features and Text-only Data in Streaming Transducer | Nov 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Multi-channel Conversational Speaker Separation via Neural Diarization | Nov 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Enhanced Generative Adversarial Networks for Unseen Word Generation from EEG Signals | Nov 14, 2023 | Brain Computer Interface, Data Augmentation | Unverified | 0
Retrieve and Copy: Scaling ASR Personalization to Large Catalogs | Nov 14, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models | Nov 14, 2023 | Acoustic Scene Classification, Audio captioning | Code Available | 3
Zero-shot audio captioning with audio-language model guidance and audio context keywords | Nov 14, 2023 | Audio captioning, Descriptive | Code Available | 1
On the Effectiveness of ASR Representations in Real-world Noisy Speech Emotion Recognition | Nov 13, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
ChatGPT in the context of precision agriculture data analytics | Nov 10, 2023 | Language Modelling, speech-recognition | Code Available | 0
Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation | Nov 9, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1
Whisper in Focus: Enhancing Stuttered Speech Classification with Encoder Layer Optimization | Nov 9, 2023 | speech-recognition, Speech Recognition | Unverified | 0
Towards End-to-End Spoken Grammatical Error Correction | Nov 9, 2023 | Grammatical Error Correction, speech-recognition | Unverified | 0
GPU-Accelerated WFST Beam Search Decoder for CTC-based Speech Recognition | Nov 8, 2023 | CPU, Decoder | Code Available | 1
1SPU: 1-step Speech Processing Unit | Nov 8, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
A comparative analysis between Conformer-Transducer, Whisper, and wav2vec2 for improving the child speech recognition | Nov 7, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning | Nov 7, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1
Fine-tuning convergence model in Bengali speech recognition | Nov 7, 2023 | Automatic Speech Recognition, model | Unverified | 0
Pseudo-Labeling for Domain-Agnostic Bangla Automatic Speech Recognition | Nov 6, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0