SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku").
    # Assumption: the installed gTTS accepts "ku"; gtts.lang.tts_langs() lists the supported codes.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example ("Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
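The snippet above opens the saved file with the Windows-only `start` command. As a minimal sketch of a portable alternative, the helper below picks a launcher per platform using only the standard library. The names `open_command` and `open_file` are hypothetical, and the fallback assumes a desktop system where `xdg-open` is installed.

```python
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the argument list that would open `path` with the default app."""
    if platform.startswith("win"):
        # `start` is a cmd.exe built-in; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux/BSD desktop with xdg-open available.
    return ["xdg-open", path]


def open_file(path):
    """Launch the default player, passing the path as a separate argument."""
    subprocess.run(open_command(path), check=False)
```

Passing the path as its own list element, rather than interpolating it into a shell string, keeps file names with spaces intact on macOS and Linux.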

Papers

Showing 301-350 of 1419 papers

Title | Status | Hype
Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion | Code | 0
An Open Source Web Reader for Under-Resourced Languages | Code | 0
Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy | Code | 0
ObamaNet: Photo-realistic lip-sync from text | Code | 0
SpikeVoice: High-Quality Text-to-Speech Via Efficient Spiking Neural Network | Code | 0
Numbers Normalisation in the Inflected Languages: a Case Study of Polish | Code | 0
Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting | Code | 0
An investigation of phrase break prediction in an End-to-End TTS system | Code | 0
BanglaFake: Constructing and Evaluating a Specialized Bengali Deepfake Audio Dataset | Code | 0
Neural Voice Puppetry: Audio-driven Facial Reenactment | Code | 0
A wearable sensor vest for social humanoid robots with GPGPU, IoT, and modular software architecture | Code | 0
Multimodal Latent Language Modeling with Next-Token Diffusion | Code | 0
Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech | Code | 0
Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech | Code | 0
MLS: A Large-Scale Multilingual Dataset for Speech Research | Code | 0
Meta Learning Text-to-Speech Synthesis in over 7000 Languages | Code | 0
MelNet: A Generative Model for Audio in the Frequency Domain | Code | 0
Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers | Code | 0
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning | Code | 0
Deep Voice 2: Multi-Speaker Neural Text-to-Speech | Code | 0
Luganda Text-to-Speech Machine | Code | 0
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis | Code | 0
Low-Resource Multilingual and Zero-Shot Multispeaker TTS | Code | 0
LibriS2S: A German-English Speech-to-Speech Translation Corpus | Code | 0
Let's Give a Voice to Conversational Agents in Virtual Reality | Code | 0
Learning Speaker Embedding from Text-to-Speech | Code | 0
Audio Super Resolution using Neural Networks | Code | 0
Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding | Code | 0
MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible | Code | 0
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features | Code | 0
JSSS: free Japanese speech corpus for summarization and simplification | Code | 0
Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming | Code | 0
"I've Heard of You!": Generate Spoken Named Entity Recognition Data for Unseen Entities | Code | 0
IsoChronoMeter: A simple and effective isochronic translation evaluation metric | Code | 0
Integrated Speech and Gesture Synthesis | Code | 0
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech | Code | 0
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language | Code | 0
CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages | Code | 0
Independent and automatic evaluation of acoustic-to-articulatory inversion models | Code | 0
Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network | Code | 0
Massively Multilingual Neural Grapheme-to-Phoneme Conversion | Code | 0
Naturalization of Text by the Insertion of Pauses and Filler Words | Code | 0
High Fidelity Speech Synthesis with Adversarial Networks | Code | 0
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation | Code | 0
Hierarchical Generative Modeling for Controllable Speech Synthesis | Code | 0
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis | Code | 0
Cross-Modal Generalization: Learning in Low Resource Modalities via Meta-Alignment | Code | 0
Attentive Multi-Layer Perceptron for Non-autoregressive Generation | Code | 0
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition | Code | 0
Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems | Code | 0
Page 7 of 29

No leaderboard results yet.