Text to Speech

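from gtts import gTTS
import os


def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Note: gTTS validates language codes by default, so confirm that "ku"
    # appears in gtts.lang.tts_langs() before relying on it.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    # Open the audio file (Windows only).
    os.system(f"start {output_file}")


# Example (the Kurdish sentence means "Hello, this is my voice in the Kurdish language."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")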
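The snippet above opens the saved file with `start`, which only works on Windows. A minimal sketch of a portable alternative follows, using only the standard library. The helper names `build_open_command` and `open_file` are hypothetical (they are not part of gTTS), and the Linux branch assumes `xdg-open` is installed.

```python
import os
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe built-in; the empty "" fills its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a desktop Linux/BSD system with xdg-utils available.
    return ["xdg-open", path]


def open_file(path):
    """Open `path` in the default player (hypothetical helper, not part of gTTS)."""
    subprocess.run(build_open_command(os.path.abspath(path)), check=False)
```

Calling `open_file("output.mp3")` after `tts.save("output.mp3")` would then replace the `os.system` line. Keeping the command construction separate from the call makes the platform logic easy to check without launching a player.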

Papers

Showing 601–650 of 1419 papers

Title	Date	Tasks	Status
Building a Luganda Text-to-Speech Model From Crowdsourced Data	May 16, 2024	Speech Enhancement, text-to-speech	Unverified
Faces that Speak: Jointly Synthesising Talking Face and Speech from Text	May 16, 2024	Code Generation, Face Generation	Unverified
Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer	May 15, 2024	Adversarial Attack, Automatic Speech Recognition	Unverified
PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset	May 14, 2024	DeepFake Detection, Face Swapping	Code Available
Real-Time Pill Identification for the Visually Impaired Using Deep Learning	May 8, 2024	Deep Learning, Management	Unverified
Attention-Constrained Inference for Robust Decoder-Only Text-to-Speech	Apr 30, 2024	Decoder, text-to-speech	Unverified
TI-ASU: Toward Robust Automatic Speech Understanding through Text-to-speech Imputation Against Missing Speech Modality	Apr 27, 2024	Imputation, text-to-speech	Unverified
StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations	Apr 23, 2024	text-to-speech, Text to Speech	Unverified
Retrieval-Augmented Audio Deepfake Detection	Apr 22, 2024	Audio Deepfake Detection, DeepFake Detection	Unverified
Prior-agnostic Multi-scale Contrastive Text-Audio Pre-training for Parallelized TTS Frontend Modeling	Apr 14, 2024	Polyphone disambiguation, Text Normalization	Unverified
Voice-Assisted Real-Time Traffic Sign Recognition System Using Convolutional Neural Network	Apr 11, 2024	Autonomous Vehicles, text-to-speech	Unverified
The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge	Apr 9, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Cross-Domain Audio Deepfake Detection: Dataset and Analysis	Apr 7, 2024	Audio Deepfake Detection, DeepFake Detection	Unverified
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis	Apr 4, 2024	Language Modeling, Language Modelling	Unverified
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech	Apr 3, 2024	Language Modeling, Language Modelling	Unverified
PSCodec: A Series of High-Fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders	Apr 3, 2024	Representation Learning, Speaker Verification	Unverified
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation	Mar 31, 2024	Language Modeling, Language Modelling	Code Available
A Review of Multi-Modal Large Language and Vision Models	Mar 28, 2024	Image Captioning, Prompt Engineering	Unverified
Isometric Neural Machine Translation using Phoneme Count Ratio Reward-based Reinforcement Learning	Mar 20, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Creating an African American-Sounding TTS: Guidelines, Technical Challenges, and Surprising Evaluations	Mar 17, 2024	Attribute, text-to-speech	Unverified
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech	Mar 13, 2024	GPU, Speech Synthesis	Unverified
Attempt Towards Stress Transfer in Speech-to-Speech Machine Translation	Mar 7, 2024	Diversity, Machine Translation	Unverified
AttentionStitch: How Attention Solves the Speech Editing Problem	Mar 5, 2024	text-to-speech, Text to Speech	Unverified
Towards Accurate Lip-to-Speech Synthesis in-the-Wild	Mar 2, 2024	Language Modelling, Lip to Speech Synthesis	Unverified
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data	Feb 29, 2024	Representation Learning, Speech Synthesis	Unverified
Efficient data selection employing Semantic Similarity-based Graph Structures for model training	Feb 22, 2024	Semantic Similarity, Semantic Textual Similarity	Unverified
Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition	Feb 22, 2024	text-to-speech, Text to Speech	Unverified
On the Semantic Latent Space of Diffusion-Based Text-to-Speech Models	Feb 19, 2024	Denoising, Image Generation	Unverified
Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting	Feb 19, 2024	Language Modeling, Language Modelling	Code Available
Ain't Misbehavin' -- Using LLMs to Generate Expressive Robot Behavior in Conversations with the Tabletop Robot Haru	Feb 18, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech	Feb 14, 2024	Decoder, GPU	Unverified
Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like	Feb 12, 2024	text-to-speech, Text to Speech	Unverified
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data	Feb 12, 2024	Decoder, Disentanglement	Unverified
A New Approach to Voice Authenticity	Feb 9, 2024	text-to-speech, Text to Speech	Unverified
Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations	Feb 5, 2024	Decoder, In-Context Learning	Unverified
Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech	Feb 1, 2024	text-to-speech, Text to Speech	Unverified
MunTTS: A Text-to-Speech System for Mundari	Jan 28, 2024	Speech Synthesis, text-to-speech	Unverified
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech	Jan 25, 2024	Decoder, Hallucination	Unverified
Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization	Jan 23, 2024	text-to-speech, Text to Speech	Unverified
Adversarial speech for voice privacy protection from Personalized Speech generation	Jan 22, 2024	Speaker Verification, text-to-speech	Unverified
Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis	Jan 22, 2024	Speaker Verification, Speech Synthesis	Unverified
Data-driven grapheme-to-phoneme representations for a lexicon-free text-to-speech	Jan 19, 2024	Self-Supervised Learning, text-to-speech	Unverified
MCMChaos: Improvising Rap Music with MCMC Methods and Chaos Theory	Jan 15, 2024	Music Generation, text-to-speech	Unverified
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering	Jan 14, 2024	Audio Generation, Language Modeling	Unverified
End to end Hindi to English speech conversion using Bark, mBART and a finetuned XLSR Wav2Vec2	Jan 11, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters	Jan 10, 2024	Self-Supervised Learning, Speech Enhancement	Unverified
Evaluating and Personalizing User-Perceived Quality of Text-to-Speech Voices for Delivering Mindfulness Meditation with Different Physical Embodiments	Jan 7, 2024	text-to-speech, Text to Speech	Unverified
Transfer the linguistic representations from TTS to accent conversion with non-parallel data	Jan 7, 2024	text-to-speech, Text to Speech	Unverified
Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction	Jan 3, 2024	text-to-speech, Text to Speech	Unverified
Incremental FastPitch: Chunk-based High Quality Text to Speech	Jan 3, 2024	Speech Synthesis, text-to-speech	Unverified

No leaderboard results yet.