SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Note: this assumes the installed gTTS accepts "ku"; gtts.lang.tts_langs() lists the supported codes.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
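The snippet above plays the result through the Windows `start` command, so it does nothing useful on other systems. A minimal sketch of a portable opener follows, assuming `open` is available on macOS and `xdg-open` on Linux-like desktops. The names `open_command` and `play_audio_file` are hypothetical helpers and are not part of gTTS.

```python
import platform
import subprocess


def open_command(path, system=None):
    """Return the argument list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    # Assumption: any other system is a Linux-like desktop with xdg-open installed.
    return ["xdg-open", path]


def play_audio_file(path):
    """Launch the default player using an argument list instead of a shell string."""
    subprocess.run(open_command(path), check=False)
```

Calling `play_audio_file(output_file)` in place of the `os.system(...)` line would keep the rest of the function unchanged. Passing an argument list also avoids problems with file names that contain spaces.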

Papers

Showing 1151-1200 of 1419 papers

Title | Status | Hype
RW-Resnet: A Novel Speech Anti-Spoofing Model Using Raw Waveform | | 0
S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamless Speech-Text Alignment and Streaming Speech Generation | | 0
Sadeed: Advancing Arabic Diacritization Through Small Language Model | | 0
SALF-MOS: Speaker Agnostic Latent Features Downsampled for MOS Prediction | | 0
SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation | | 0
SALTTS: Leveraging Self-Supervised Speech Representations for improved Text-to-Speech Synthesis | | 0
Sample Efficient Adaptive Text-to-Speech | | 0
SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech | | 0
SANIP: Shopping Assistant and Navigation for the visually impaired | | 0
SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate | | 0
Scalable Multilingual Frontend for TTS | | 0
Scale This, Not That: Investigating Key Dataset Attributes for Efficient Speech Enhancement Scaling | | 0
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis | | 0
SeamlessEdit: Background Noise Aware Zero-Shot Speech Editing with in-Context Enhancement | | 0
Seeing Voices: Generating A-Roll Video from Audio with Mirage | | 0
SegINR: Segment-wise Implicit Neural Representation for Sequence Alignment in Neural Text-to-Speech | | 0
Segmentation-Variant Codebooks for Preservation of Paralinguistic and Prosodic Information | | 0
SelectTTS: Synthesizing Anyone's Voice via Discrete Unit-Based Frame Selection | | 0
Self-Attention Linguistic-Acoustic Decoder | | 0
Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text | | 0
Semi-Supervised Generative Modeling for Controllable Speech Synthesis | | 0
Semi-Supervised Learning Based on Reference Model for Low-resource TTS | | 0
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation | | 0
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis | | 0
Semi-supervised transfer learning for language expansion of end-to-end speech recognition models to low-resource languages | | 0
Sentence Based Discourse Classification for Hindi Story Text-to-Speech (TTS) System | | 0
Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis | | 0
Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody | | 0
SimpleSpeech 2: Towards Simple and Efficient Text-to-Speech with Flow-based Scalar Latent Transformer Diffusion Models | | 0
Simultaneous Speech-to-Speech Translation System with Neural Incremental ASR, MT, and TTS | | 0
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation | | 0
Singing Synthesis: with a little help from my attention | | 0
SlimSpeech: Lightweight and Efficient Text-to-Speech with Slim Rectified Flow | | 0
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs | | 0
Smart Summarizer for Blind People | | 0
SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech | | 0
SNIPER Training: Single-Shot Sparse Training for Text-to-Speech | | 0
SoK: A Study of the Security on Voice Processing Systems | | 0
SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis | | 0
Source Tracing of Audio Deepfake Systems | | 0
MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis | | 0
SpeakEasy: Enhancing Text-to-Speech Interactions for Expressive Content Creation | | 0
Speaker-adaptive neural vocoders for parametric speech synthesis systems | | 0
Speaker Generation | | 0
Speaker-independent raw waveform model for glottal excitation | | 0
Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis | | 0
Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention | | 0
SpeakStream: Streaming Text-to-Speech with Interleaved Data | | 0
Speak While You Think: Streaming Speech Synthesis During Text Generation | | 0
Spectral Codecs: Improving Non-Autoregressive Speech Synthesis with Spectrogram-Based Audio Codecs | | 0

No leaderboard results yet.