| Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech | Sep 21, 2023 | text-to-speech, Text to Speech | Code Available | 1 | 5 |
| AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data | Apr 20, 2021 | Decoder, text-to-speech | Code Available | 1 | 5 |
| Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models | May 21, 2025 | Bayesian Optimization, Speech Synthesis | Code Available | 1 | 5 |
| PRESENT: Zero-Shot Text-to-Prosody Control | Aug 13, 2024 | Prosody Prediction, Speech Synthesis | Code Available | 1 | 5 |
| Pretraining Techniques for Sequence-to-Sequence Voice Conversion | Aug 7, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus | Jul 7, 2022 | text-to-speech, Text to Speech | Code Available | 1 | 5 |
| Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech | Jan 1, 2021 | text-to-speech, Text to Speech | Code Available | 1 | 5 |
| BiSinger: Bilingual Singing Voice Synthesis | Sep 25, 2023 | Singing Voice Synthesis, text-to-speech | Code Available | 1 | 5 |
| IESTAC: English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation | Nov 1, 2020 | Dynamic Time Warping, Machine Translation | Code Available | 1 | 5 |
| Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech | Feb 27, 2023 | Speech Synthesis, text-to-speech | Code Available | 1 | 5 |
| EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech | Jun 28, 2023 | Emotion Recognition, Speech Synthesis | Code Available | 1 | 5 |
| EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels | May 22, 2023 | Expressive Speech Synthesis, Speech Synthesis | Code Available | 1 | 5 |
| Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech | Oct 1, 2023 | speech-recognition, Speech Recognition | Code Available | 1 | 5 |
| ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech | Dec 30, 2022 | Denoising, text-to-speech | Code Available | 1 | 5 |
| Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention | Oct 24, 2017 | text-to-speech, Text to Speech | Code Available | 1 | 5 |
| JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech | Mar 31, 2022 | text-to-speech, Text to Speech | Code Available | 1 | 5 |
| Effective Deep Learning Models for Automatic Diacritization of Arabic Text | Nov 1, 2020 | Arabic Text Diacritization, Decoder | Code Available | 1 | 5 |
| Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding | Aug 12, 2020 | Speech Synthesis, text-to-speech | Code Available | 1 | 5 |
| Attentive Sequence-to-Sequence Learning for Diacritic Restoration of Yorùbá Language Text | Apr 3, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset | Apr 17, 2021 | Speech Synthesis, text-to-speech | Code Available | 1 | 5 |
| Bts-e: Audio deepfake detection using breathing-talking-silence encoder | May 5, 2023 | Audio Deepfake Detection, DeepFake Detection | Code Available | 1 | 5 |
| KazEmoTTS: A Dataset for Kazakh Emotional Text-to-Speech Synthesis | Apr 1, 2024 | Speech Synthesis, text-to-speech | Code Available | 1 | 5 |
| EdiTTS: Score-based Editing for Controllable Text-to-Speech | Oct 6, 2021 | Speech Synthesis, Speech-to-Text | Code Available | 1 | 5 |
| ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features | Aug 3, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| Phonological Features for 0-shot Multilingual Speech Synthesis | Aug 6, 2020 | Speech Synthesis, text-to-speech | Code Available | 1 | 5 |
| EfficientSpeech: An On-Device Text to Speech Model | May 23, 2023 | CPU, model | Code Available | 1 | 5 |
| ShiftySpeech: A Large-Scale Synthetic Speech Dataset with Distribution Shifts | Feb 8, 2025 | Benchmarking, Self-Supervised Learning | Code Available | 1 | 5 |
| Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training | Mar 31, 2021 | text-to-speech, Text to Speech | Code Available | 1 | 5 |
| E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS | Jun 26, 2024 | text-to-speech, Text to Speech | Code Available | 1 | 5 |
| Attention model for articulatory features detection | Jul 2, 2019 | Manner Of Articulation Detection, model | Code Available | 1 | 5 |
| ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation | May 29, 2023 | Speech Synthesis, text-to-speech | Code Available | 1 | 5 |
| EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion | Jul 4, 2021 | text-to-speech, Text to Speech | Code Available | 1 | 5 |
| MathReader : Text-to-Speech for Mathematical Documents | Jan 13, 2025 | Optical Character Recognition (OCR), text-to-speech | Code Available | 1 | 5 |
| TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models | Aug 28, 2023 | Language Modelling, text-to-speech | Code Available | 1 | 5 |
| Dreamento: an open-source dream engineering toolbox for sleep EEG wearables | Jul 8, 2022 | EEG, Electroencephalogram (EEG) | Code Available | 1 | 5 |
| Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation | May 18, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data | May 18, 2023 | Speech Enhancement, Speech Synthesis | Code Available | 1 | 5 |
| Parameter-Efficient Learning for Text-to-Speech Accent Adaptation | May 18, 2023 | Decoder, text-to-speech | Code Available | 1 | 5 |
| DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training | Jul 31, 2023 | Denoising, Expressive Speech Synthesis | Code Available | 1 | 5 |
| Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations | Mar 3, 2023 | Speech Denoising, Speech Enhancement | Code Available | 1 | 5 |
| Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech | Jun 5, 2022 | Polyphone disambiguation, text-to-speech | Code Available | 1 | 5 |
| Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech | Nov 7, 2021 | Meta-Learning, Speech Synthesis | Code Available | 1 | 5 |
| Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0 | Mar 14, 2020 | Clustering, Representation Learning | Code Available | 1 | 5 |
| Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling | Oct 8, 2020 | Speech Recognition, text-to-speech | Code Available | 1 | 5 |
| One-class learning towards generalized voice spoofing detection | Oct 27, 2020 | Speaker Verification, text-to-speech | Code Available | 1 | 5 |
| Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings | Oct 7, 2021 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| A Survey on Neural Speech Synthesis | Jun 29, 2021 | Speech Synthesis, Survey | Code Available | 1 | 5 |
| MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline | Sep 22, 2022 | Speech Synthesis, text-to-speech | Code Available | 1 | 5 |
| Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism | Dec 11, 2023 | Face Generation, Lip Reading | Code Available | 1 | 5 |
| One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech | Aug 3, 2020 | Meta-Learning, Speech Synthesis | Code Available | 1 | 5 |