SOTAVerified

Text to Speech

from gtts import gTTS
import os


def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Assumption: the installed gTTS version accepts "ku"; gTTS raises
    # ValueError for language codes it does not support.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)


# Example:
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
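The snippet above opens the saved file with `start`, which only exists on Windows. As a minimal sketch of a portable alternative, the helper below picks an opener command per platform and runs it without building a shell string. The function names `build_open_command` and `open_audio_file` are hypothetical, and the fallback assumes a Linux or BSD desktop with `xdg-open` installed.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argv list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe built-in; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux/BSD desktop with xdg-utils installed.
    return ["xdg-open", path]


def open_audio_file(path):
    """Open the audio file; the path is passed as an argument, not a shell string."""
    subprocess.run(build_open_command(path), check=False)
```

Calling `open_audio_file("output.mp3")` after `tts.save("output.mp3")` would replace the `os.system` line. Passing the path as a list element also avoids problems with file names that contain spaces.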

Papers

Showing 101-125 of 1419 papers

Title | Status | Hype
Audio Deepfake Detection with Self-Supervised XLS-R and SLS Classifier | Code | 2
Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment | Code | 2
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning | Code | 2
TokenSynth: A Token-based Neural Synthesizer for Instrument Cloning and Text-to-Instrument | Code | 2
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | Code | 2
Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation | Code | 2
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis | Code | 2
ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus | Code | 1
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech | Code | 1
Improving fairness for spoken language understanding in atypical speech with Text-to-Speech | Code | 1
IESTAC: English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation | Code | 1
Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning | Code | 1
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset | Code | 1
HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation | Code | 1
HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods | Code | 1
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech | Code | 1
Google Crowdsourced Speech Corpora and Related Open-Source Resources for Low-Resource Languages and Dialects: An Overview | Code | 1
GUIRoboTron-Speech: Towards Automated GUI Agents Based on Speech Instructions | Code | 1
HUI-Audio-Corpus-German: A high quality TTS dataset | Code | 1
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech | Code | 1
g2pM: A Neural Grapheme-to-Phoneme Conversion Package for Mandarin Chinese Based on a New Open Benchmark Dataset | Code | 1
Improving TTS for Shanghainese: Addressing Tone Sandhi via Word Segmentation | Code | 1
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data | Code | 1
InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems | Code | 1
ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features | Code | 1
Page 5 of 57

No leaderboard results yet.