Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku").
    # Note: gTTS raises ValueError if "ku" is not among its supported languages.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (Kurdish for "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")

Papers

Showing 501–550 of 1419 papers

Title	Date	Tasks	Status	Hype
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling	Oct 14, 2023	Speech Synthesis, text-to-speech	Code Available	2
Crowdsourced and Automatic Speech Prominence Estimation	Oct 12, 2023	Emotion Recognition, text-to-speech	Code Available	1
On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition	Oct 12, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Prosody Analysis of Audiobooks	Oct 10, 2023	Attribute, Language Modeling	Code Available	0
Neutral TTS Female Voice Corpus in Brazilian Portuguese	Oct 8, 2023	Speech Synthesis, text-to-speech	Unverified	0
Unified speech and gesture synthesis using flow matching	Oct 8, 2023	Audio Synthesis, Motion Synthesis	Unverified	0
Comparative Analysis of Transfer Learning in Deep Learning Text-to-Speech Models on a Few-Shot, Low-Resource, Customized Dataset	Oct 8, 2023	text-to-speech, Text to Speech	Unverified	0
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT	Oct 7, 2023	Audio captioning, Automatic Speech Recognition	Code Available	2
Latent Filling: Latent Space Data Augmentation for Zero-shot Speech Synthesis	Oct 5, 2023	Data Augmentation, Speech Synthesis	Unverified	0
The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains	Oct 4, 2023	Speech Synthesis, text-to-speech	Unverified	0
Towards human-like spoken dialogue generation between AI agents from written dialogue	Oct 2, 2023	Dialogue Generation, text-to-speech	Unverified	0
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech	Oct 1, 2023	speech-recognition, Speech Recognition	Code Available	1
Synthetic Speech Detection Based on Temporal Consistency and Distribution of Speaker Features	Sep 29, 2023	Synthetic Speech Detection, text-to-speech	Unverified	0
Low-Resource Self-Supervised Learning with SSL-Enhanced TTS	Sep 29, 2023	Self-Supervised Learning, text-to-speech	Unverified	0
High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models	Sep 27, 2023	All, Speech Synthesis	Unverified	0
Face-StyleSpeech: Enhancing Zero-shot Speech Synthesis from Face Images with Improved Face-to-Speech Mapping	Sep 25, 2023	Speech Synthesis, text-to-speech	Unverified	0
BiSinger: Bilingual Singing Voice Synthesis	Sep 25, 2023	Singing Voice Synthesis, text-to-speech	Code Available	1
VoiceLDM: Text-to-Speech with Environmental Context	Sep 24, 2023	AudioCaps, text-to-speech	Unverified	0
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis	Sep 22, 2023	Denoising, Speech Synthesis	Unverified	0
Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech	Sep 21, 2023	text-to-speech, Text to Speech	Code Available	1
The Impact of Silence on Speech Anti-Spoofing	Sep 21, 2023	Action Detection, Activity Detection	Unverified	0
Speak While You Think: Streaming Speech Synthesis During Text Generation	Sep 20, 2023	Speech Synthesis, Text Generation	Unverified	0
Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model	Sep 20, 2023	Chatbot, Language Modeling	Code Available	1
Exploring Speech Enhancement for Low-resource Speech Synthesis	Sep 19, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition	Sep 19, 2023	Data Augmentation, Emotion Recognition	Unverified	0
Augmenting text for spoken language understanding with Large Language Models	Sep 17, 2023	Semantic Parsing, Spoken Language Understanding	Unverified	0
HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods	Sep 15, 2023	Audio Deepfake Detection, DeepFake Detection	Code Available	1
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions	Sep 15, 2023	text-to-speech, Text to Speech	Unverified	0
Cross-lingual Knowledge Distillation via Flow-based Voice Conversion for Robust Polyglot Text-To-Speech	Sep 15, 2023	Knowledge Distillation, Speech Synthesis	Unverified	0
FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec	Sep 14, 2023	Automatic Speech Recognition, speech-recognition	Code Available	2
Direct Text to Speech Translation System using Acoustic Units	Sep 14, 2023	Decoder, Speech-to-Speech Translation	Unverified	0
Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP	Sep 11, 2023	text-to-speech, Text to Speech	Code Available	1
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching	Sep 10, 2023	text-to-speech, Text to Speech	Code Available	2
Cross-Utterance Conditioned VAE for Speech Generation	Sep 8, 2023	Speech Synthesis, text-to-speech	Unverified	0
Large-Scale Automatic Audiobook Creation	Sep 7, 2023	text-to-speech, Text to Speech	Unverified	0
MuLanTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2023	Sep 6, 2023	Speech Synthesis, text-to-speech	Unverified	0
GRASS: Unified Generation Model for Speech-to-Semantic Tasks	Sep 6, 2023	named-entity-recognition, Named Entity Recognition	Unverified	0
PromptTTS 2: Describing and Generating Voices with Text Prompt	Sep 5, 2023	Language Modelling, Large Language Model	Unverified	0
A Comparative Analysis of Pretrained Language Models for Text-to-Speech	Sep 4, 2023	Natural Language Understanding, Prediction	Unverified	0
The FruitShell French synthesis system at the Blizzard 2023 Challenge	Sep 1, 2023	Data Augmentation, Speech Synthesis	Unverified	0
Learning Speech Representation From Contrastive Token-Acoustic Pretraining	Sep 1, 2023	Audio Classification, Automatic Speech Recognition	Unverified	0
QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning	Aug 31, 2023	Representation Learning, Speech Representation Learning	Code Available	1
SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models	Aug 31, 2023	Decoder, Language Modeling	Code Available	2
Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis	Aug 31, 2023	Expressive Speech Synthesis, Sentence	Unverified	0
Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information	Aug 31, 2023	Decoder, Multi-Task Learning	Unverified	0
The DeepZen Speech Synthesis System for Blizzard Challenge 2023	Aug 30, 2023	Sentence, Speech Synthesis	Unverified	0
Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech	Aug 28, 2023	Domain Generalization, text-to-speech	Unverified	0
TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models	Aug 28, 2023	Language Modelling, text-to-speech	Code Available	1
Rep2wav: Noise Robust text-to-speech Using self-supervised representations	Aug 28, 2023	Speech Enhancement, text-to-speech	Unverified	0
Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations	Aug 24, 2023	Representation Learning, Speech Synthesis	Unverified	0

