SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Assumption: the installed gTTS version accepts "ku"; gtts.lang.tts_langs() lists the supported codes.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example:
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")  # "Hello, this is my voice in Kurdish."
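The snippet above opens the saved file with the Windows-only `start` command. A minimal sketch of a portable alternative follows, assuming `open` is available on macOS and `xdg-open` on other desktops. The helper names `build_open_command` and `open_audio` are hypothetical and not part of gTTS.

```python
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argument list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string fills its window-title slot.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    # Assumption: other platforms provide xdg-open (common on Linux desktops).
    return ["xdg-open", path]


def open_audio(path):
    """Launch the default player for `path` without building a shell string."""
    subprocess.run(build_open_command(path), check=False)
```

Passing the path as a separate list element keeps it out of shell string interpolation, so file names containing spaces need no manual quoting.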

Papers

Showing 351–400 of 1419 papers

Title | Status | Hype
Integrated Speech and Gesture Synthesis | Code | 0
High Fidelity Speech Synthesis with Adversarial Networks | Code | 0
Hierarchical Generative Modeling for Controllable Speech Synthesis | Code | 0
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation | Code | 0
Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging | Code | 0
Generating Synthetic Speech from SpokenVocab for Speech Translation | Code | 0
GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram | Code | 0
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition | Code | 0
FPETS : Fully Parallel End-to-End Text-to-Speech System | Code | 0
Continuous Speech Tokenizer in Text To Speech | Code | 0
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment | Code | 0
Few-Shot Speech Deepfake Detection Adaptation with Gaussian Processes | Code | 0
Facial Landmark Predictions with Applications to Metaverse | Code | 0
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit | Code | 0
Expediting TTS Synthesis with Adversarial Vocoding | Code | 0
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit | Code | 0
Exploring TTS without T Using Biologically/Psychologically Motivated Neural Network Modules (ZeroSpeech 2020) | Code | 0
Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis | Code | 0
Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging | Code | 0
Emotional Voice Conversion using Multitask Learning with Text-to-speech | Code | 0
EmoNews: A Spoken Dialogue System for Expressive News Conversations | Code | 0
ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual neTworks | Code | 0
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation | Code | 0
Emphasis Rendering for Conversational Text-to-Speech with Multi-modal Multi-scale Context Modeling | Code | 0
AI4D -- African Language Program | Code | 0
ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis | Code | 0
Effective parameter estimation methods for an ExcitNet model in generative text-to-speech systems | Code | 0
ClonEval: An Open Voice Cloning Benchmark | Code | 0
Empirical Evaluation of Deep Learning Model Compression Techniques on the WaveNet Vocoder | Code | 0
FluentEditor2: Text-based Speech Editing by Modeling Multi-Scale Acoustic and Prosody Consistency | Code | 0
Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding | Code | 0
Clip-TTS: Contrastive Text-content and Mel-spectrogram, A High-Quality Text-to-Speech Method based on Contextual Semantic Understanding | | 0
ClArTTS: An Open-Source Classical Arabic Text-to-Speech Corpus | | 0
ArmanTTS single-speaker Persian dataset | | 0
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech | | 0
A Review of Multi-Modal Large Language and Vision Models | | 0
A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer | | 0
CHULA TTS: A Modularized Text-To-Speech Framework | | 0
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network | | 0
A Review of Deep Learning Techniques for Speech Processing | | 0
ChatAnything: Facetime Chat with LLM-Enhanced Personas | | 0
Character-Level Bangla Text-to-IPA Transcription Using Transformer Architecture with Sequence Alignment | | 0
A review-based study on different Text-to-Speech technologies | | 0
A Generative Model of a Pronunciation Lexicon for Hindi | | 0
A Cost Efficient Approach to Correct OCR Errors in Large Document Collections | | 0
Characteristic-Specific Partial Fine-Tuning for Efficient Emotion and Speaker Adaptation in Codec Language Text-to-Speech Models | | 0
Chain-of-Thought Training for Open E2E Spoken Dialogue Systems | | 0
CASSANDRA: A multipurpose configurable voice-enabled human-computer-interface | | 0
Arabic Text-To-Speech (TTS) Data Preparation | | 0
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer | | 0
Page 8 of 29

No leaderboard results yet.