SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" chosen for Kurdish).
    # Whether gTTS accepts "ku" depends on the installed version; gtts.lang.tts_langs() lists the supported codes.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
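The snippet above opens the saved file with `start`, which only exists on Windows. As a minimal sketch of one way to make that step portable, the helper below picks a launcher command per operating system using only the standard library. The names `player_command` and `open_audio` are hypothetical and not part of the original snippet, and the fallback to `xdg-open` assumes a Linux-like desktop.

```python
import platform
import subprocess


def player_command(path, system=None):
    """Return the argument list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string fills its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    # Assumption: any other system is a Linux-like desktop that provides xdg-open.
    return ["xdg-open", path]


def open_audio(path):
    """Launch the default player for the file at `path`."""
    subprocess.run(player_command(path), check=False)
```

Calling `open_audio("output.mp3")` after `tts.save(...)` would stand in for the `os.system` line; passing the path as a list element also avoids quoting problems with file names that contain spaces.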

Papers

Showing 1301-1350 of 1419 papers

Title | Status | Hype
Towards Fully Automatic Annotation of Audio Books for TTS |  | 0
Towards human-like spoken dialogue generation between AI agents from written dialogue |  | 0
Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement |  | 0
Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale |  | 0
Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram |  | 0
Towards Natural Bilingual and Code-Switched Speech Synthesis Based on Mix of Monolingual Recordings and Cross-Lingual Voice Conversion |  | 0
Towards Optimizing OCR for Accessibility |  | 0
Towards Robust FastSpeech 2 by Modelling Residual Multimodality |  | 0
Towards Robust Neural Vocoding for Speech Generation: A Survey |  | 0
Prosody Analysis of Audiobooks | Code | 0
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit | Code | 0
Systematic Inequalities in Language Technology Performance across the World's Languages | Code | 0
Systematic Inequalities in Language Technology Performance across the World’s Languages | Code | 0
Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding | Code | 0
FPETS : Fully Parallel End-to-End Text-to-Speech System | Code | 0
QSpeech: Low-Qubit Quantum Speech Application Toolkit | Code | 0
PromptTTS: Controllable Text-to-Speech with Text Descriptions | Code | 0
FluentEditor2: Text-based Speech Editing by Modeling Multi-Scale Acoustic and Prosody Consistency | Code | 0
Pretrained Speech Encoders and Efficient Fine-tuning Methods for Speech Translation: UPC at IWSLT 2022 | Code | 0
Empirical Evaluation of Deep Learning Model Compression Techniques on the WaveNet Vocoder | Code | 0
Few-Shot Speech Deepfake Detection Adaptation with Gaussian Processes | Code | 0
Emphasis Rendering for Conversational Text-to-Speech with Multi-modal Multi-scale Context Modeling | Code | 0
Direct speech-to-speech translation with a sequence-to-sequence model | Code | 0
Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting | Code | 0
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021 | Code | 0
Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish | Code | 0
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit | Code | 0
SPEECH-COCO: 600k Visually Grounded Spoken Captions Aligned to MSCOCO Data Set | Code | 0
SpikeVoice: High-Quality Text-to-Speech Via Efficient Spiking Neural Network | Code | 0
Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming | Code | 0
Emotional Voice Conversion using Multitask Learning with Text-to-speech | Code | 0
JSSS: free Japanese speech corpus for summarization and simplification | Code | 0
"I've Heard of You!": Generate Spoken Named Entity Recognition Data for Unseen Entities | Code | 0
Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis | Code | 0
AI4D -- African Language Program | Code | 0
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer | Code | 0
Predicting distributions with Linearizing Belief Networks | Code | 0
Speech Synthesis from Text and Ultrasound Tongue Image-based Articulatory Input | Code | 0
Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning | Code | 0
Deep Voice: Real-time Neural Text-to-Speech | Code | 0
IsoChronoMeter: A simple and effective isochronic translation evaluation metric | Code | 0
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning | Code | 0
EmoNews: A Spoken Dialogue System for Expressive News Conversations | Code | 0
When Is TTS Augmentation Through a Pivot Language Useful? | Code | 0
Facial Landmark Predictions with Applications to Metaverse | Code | 0
Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging | Code | 0
Text-to-ECG: 12-Lead Electrocardiogram Synthesis conditioned on Clinical Text Reports | Code | 0
Unsupervised Data Selection for TTS: Using Arabic Broadcast News as a Case Study | Code | 0
PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset | Code | 0
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language | Code | 0

No leaderboard results yet.