SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Note: gTTS rejects language codes it does not list; check
    # gtts.lang.tts_langs() to confirm that "ku" is available.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example ("Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
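The snippet above opens the saved file with the Windows-only `start` command. A minimal sketch of a cross-platform alternative follows, using only the standard library. The helper names `open_command` and `open_audio` are hypothetical and not part of gTTS, and the fallback assumes a Linux-like desktop where `xdg-open` is installed.

```python
import platform
import subprocess


def open_command(path, system=None):
    """Return the argument list that opens `path` with the default application.

    `open_command` is a hypothetical helper for illustration only.
    """
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: any other system is a Linux-like desktop with xdg-open.
    return ["xdg-open", path]


def open_audio(path):
    """Open the audio file without building a shell command string."""
    subprocess.run(open_command(path), check=False)
```

Passing the path as a separate list element, rather than interpolating it into a shell string, also avoids problems with file names that contain spaces.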

Papers

Showing 601-650 of 1419 papers

Title | Status | Hype
XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech | Code | 5
Towards Selection of Text-to-speech Data to Augment ASR Training | - | 0
Resource-Efficient Fine-Tuning Strategies for Automatic MOS Prediction in Text-to-Speech for Low-Resource Languages | - | 0
LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus | - | 0
Make-A-Voice: Unified Voice Synthesis With Discrete Representation | - | 0
STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions | - | 0
Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis | - | 0
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation | Code | 1
Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS | Code | 1
An Efficient Membership Inference Attack for the Diffusion Model by Proximal Initialization | Code | 1
DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction | - | 0
Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion | Code | 0
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation | - | 0
Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration | Code | 1
LAraBench: Benchmarking Arabic AI with Large Language Models | - | 0
EfficientSpeech: An On-Device Text to Speech Model | Code | 1
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models | - | 0
Text Generation with Speech Synthesis for ASR Data Augmentation | - | 0
EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels | Code | 1
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer | - | 0
VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages | - | 0
ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios | - | 0
MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting | - | 0
Data Redaction from Conditional Generative Models | - | 0
Parameter-Efficient Learning for Text-to-Speech Accent Adaptation | Code | 1
Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation | Code | 1
Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data | Code | 1
A unified front-end framework for English text-to-speech synthesis | - | 0
FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder With Multiple STFTs | - | 0
Controllable Speaking Styles Using a Large Language Model | - | 0
Better speech synthesis through scaling | Code | 6
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model | Code | 2
Accented Text-to-Speech Synthesis with Limited Data | - | 0
Bts-e: Audio deepfake detection using breathing-talking-silence encoder | Code | 1
M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis | - | 0
A Review of Deep Learning Techniques for Speech Processing | - | 0
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis | Code | 2
Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model | - | 0
DiffVoice: Text-to-Speech with Latent Diffusion | - | 0
Enhancing Suno's Bark Text-to-Speech Model: Addressing Limitations Through Meta's Encodec and Pre-Trained Hubert | Code | 4
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers | Code | 2
A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers | - | 0
Enhancing Speech-to-Speech Translation with Multiple TTS Targets | - | 0
An investigation of phrase break prediction in an End-to-End TTS system | Code | 0
ArmanTTS single-speaker Persian dataset | - | 0
Ensemble prosody prediction for expressive speech synthesis | - | 0
AraSpot: Arabic Spoken Command Spotting | Code | 0
Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages | Code | 1
Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis | - | 0
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis | - | 0
Page 13 of 29

No leaderboard results yet.