| An Empirical Study on L2 Accents of Cross-lingual Text-to-Speech Systems via Vowel Space | Nov 6, 2022 | text-to-speech, Text to Speech | —Unverified | 0 |
| An End-to-End Neural Network for Image-to-Audio Transformation | Mar 10, 2023 | Image to text, text-to-speech | —Unverified | 0 |
| A Neural Text-to-Speech Model Utilizing Broadcast Data Mixed with Background Music | Mar 4, 2021 | text-to-speech, Text to Speech | —Unverified | 0 |
| A New Approach to Voice Authenticity | Feb 9, 2024 | text-to-speech, Text to Speech | —Unverified | 0 |
| An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR | Mar 11, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis | Dec 8, 2023 | Benchmarking, Quantization | —Unverified | 0 |
| An Expert System for Automatic Reading of A Text Written in Standard Arabic | May 8, 2014 | Speech Synthesis, text-to-speech | —Unverified | 0 |
| An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS | Jun 25, 2025 | Speaker Recognition, text-to-speech | —Unverified | 0 |
| An Implementation of Back-Propagation Learning on GF11, a Large SIMD Parallel Computer | Jan 4, 2018 | Neural Network simulation, text-to-speech | —Unverified | 0 |
| An In-depth Analysis of the Effect of Text Normalization in Social Media | May 1, 2015 | Dependency Parsing, named-entity-recognition | —Unverified | 0 |
| An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS | Jun 9, 2024 | Denoising, Speech Denoising | —Unverified | 0 |
| An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis | Jun 3, 2021 | Speaker Verification, Speech Synthesis | —Unverified | 0 |
| Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy | Oct 13, 2022 | Generative Adversarial Network, Speaker anonymization | —Unverified | 0 |
| A Novel Approach to OCR using Image Recognition based Classification for Ancient Tamil Inscriptions in Temples | Jul 4, 2019 | Binarization, General Classification | —Unverified | 0 |
| A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation | Jun 10, 2022 | Machine Translation, text-to-speech | —Unverified | 0 |
| A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions | Jun 4, 2025 | Data Augmentation, Diversity | —Unverified | 0 |
| An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era | Oct 6, 2022 | Speech Synthesis, text-to-speech | —Unverified | 0 |
| An overview of text-to-speech systems and media applications | Oct 22, 2023 | Acoustic Modelling, text-to-speech | —Unverified | 0 |
| Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck | Apr 4, 2022 | Speaker Verification, text-to-speech | —Unverified | 0 |
| AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person | Aug 9, 2021 | Talking Head Generation, text-to-speech | —Unverified | 0 |
| Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models | Nov 17, 2022 | Speech Synthesis, text-to-speech | —Unverified | 0 |
| 基於字元階層之語音合成用文脈訊息擷取 (Character-Level Linguistic Features Extraction for Text-to-Speech System) [In Chinese] | Dec 1, 2016 | Feature Engineering, Speech Synthesis | —Unverified | 0 |
| 基於字元階層之語音合成用文脈訊息擷取 (Character-Level Linguistic Features Extraction for Text-to-Speech System) [In Chinese] | Oct 1, 2016 | text-to-speech, Text to Speech | —Unverified | 0 |
| A Polyphone BERT for Polyphone Disambiguation in Mandarin Chinese | Jul 1, 2022 | Polyphone disambiguation, text-to-speech | —Unverified | 0 |
| Application of ASV for Voice Identification after VC and Duration Predictor Improvement in TTS Models | Jun 27, 2024 | Speaker Verification, text-to-speech | —Unverified | 0 |
| Applying Automated Machine Translation to Educational Video Courses | Jan 9, 2023 | Machine Translation, Speech Synthesis | —Unverified | 0 |
| Applying Feature Underspecified Lexicon Phonological Features in Multilingual Text-to-Speech | Apr 14, 2022 | Language Acquisition, text-to-speech | —Unverified | 0 |
| Applying Syntax–Prosody Mapping Hypothesis and Prosodic Well-Formedness Constraints to Neural Sequence-to-Sequence Speech Synthesis | Mar 29, 2022 | Speech Synthesis, text-to-speech | —Unverified | 0 |
| A Practical Guide to Logical Access Voice Presentation Attack Detection | Jan 10, 2022 | Artifact Detection, Speaker Verification | —Unverified | 0 |
| A Preliminary Analysis of Automatic Word and Syllable Prominence Detection in Non-Native Speech With Text-to-Speech Prosody Embeddings | Dec 11, 2024 | text-to-speech, Text to Speech | —Unverified | 0 |
| A Proposal of Automatic Error Correction in Text | Sep 24, 2021 | Information Retrieval, Language Modelling | —Unverified | 0 |
| Arabic Text-To-Speech (TTS) Data Preparation | Apr 7, 2022 | text-to-speech, Text to Speech | —Unverified | 0 |
| A review-based study on different Text-to-Speech technologies | Dec 17, 2023 | text-to-speech, Text to Speech | —Unverified | 0 |
| A Review of Deep Learning Techniques for Speech Processing | Apr 30, 2023 | Automatic Speech Recognition, Deep Learning | —Unverified | 0 |
| A Review of Multi-Modal Large Language and Vision Models | Mar 28, 2024 | Image Captioning, Prompt Engineering | —Unverified | 0 |
| ArmanTTS single-speaker Persian dataset | Apr 7, 2023 | text-to-speech, Text to Speech | —Unverified | 0 |
| Artificial Eye for the Blind | Jul 7, 2023 | Object, object-detection | —Unverified | 0 |
| A Self-Refining Framework for Enhancing ASR Using TTS-Synthesized Data | Jun 10, 2025 | text-to-speech, Text to Speech | —Unverified | 0 |
| A Simple Baseline for Domain Adaptation in End to End ASR Systems Using Synthetic Data | Jun 22, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| A Speech-enabled Fixed-phrase Translator for Healthcare Accessibility | Aug 1, 2021 | Machine Translation, speech-recognition | —Unverified | 0 |
| ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation | Jul 7, 2024 | Sentence, text-to-speech | —Unverified | 0 |
| AS-Speech: Adaptive Style For Speech Synthesis | Sep 9, 2024 | Rhythm, Speech Synthesis | —Unverified | 0 |
| A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis | Aug 3, 2022 | Speech Synthesis, text-to-speech | —Unverified | 0 |
| A Study of Non-autoregressive Model for Sequence Generation | Apr 22, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| A Study on Altering the Latent Space of Pretrained Text to Speech Models for Improved Expressiveness | Nov 17, 2023 | text-to-speech, Text to Speech | —Unverified | 0 |
| A study on the efficacy of model pre-training in developing neural text-to-speech system | Oct 8, 2021 | Computational Efficiency, text-to-speech | —Unverified | 0 |
| A Survey on Audio Synthesis and Audio-Visual Multimodal Processing | Aug 1, 2021 | Audio Synthesis, Music Generation | —Unverified | 0 |
| ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech | Feb 13, 2025 | Adversarial Attack, Adversarial Attack Detection | —Unverified | 0 |
| Asynchronous Tool Usage for Real-Time Agents | Oct 28, 2024 | Automatic Speech Recognition, speech-recognition | —Unverified | 0 |
| A System for Diacritizing Four Varieties of Arabic | Nov 1, 2019 | Feature Engineering, text-to-speech | —Unverified | 0 |