Probing Speaker-specific Features in Speaker Representations Jan 9, 2025 Self-Supervised Learning Speaker Verification
A Multi-Agent Framework for Automated Qinqiang Opera Script Generation Using Large Language Models Apr 22, 2025 cross-modal alignment Script Generation
PROEMO: Prompt-Driven Text-to-Speech Synthesis Based on Emotion and Intensity Control Jan 10, 2025 Speech Synthesis text-to-speech
PSCodec: A Series of High-Fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders Apr 3, 2024 Representation Learning Speaker Verification
ProsodyFM: Unsupervised Phrasing and Intonation Control for Intelligible Speech Synthesis Dec 16, 2024 Speech Synthesis text-to-speech
Prosody-TTS: An end-to-end speech synthesis system with prosody control Oct 6, 2021 Rhythm Speech Synthesis
Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis Apr 14, 2025 Language Modeling Language Modelling
Punjabi Text-To-Speech Synthesis System Dec 1, 2012 Speech Synthesis text-to-speech
Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder Jul 31, 2018 Generative Adversarial Network Speech Synthesis
Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks Oct 30, 2018 Image Generation Speech Synthesis
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis Apr 4, 2024 Language Modeling Language Modelling
Real-time Incremental Speech-to-Speech Translation of Dialogs Jun 1, 2012 Machine Translation Speech Recognition
ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic Divergence May 9, 2022 Speech Synthesis text-to-speech
Refer-iTTS: A System for Referring in Spoken Installments to Objects in Real-World Images Sep 1, 2017 Referring Expression Referring expression generation
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability Apr 3, 2021 Emotion Recognition reinforcement-learning
DLPO: Diffusion Model Loss-Guided Reinforcement Learning for Fine-Tuning Text-to-Speech Diffusion Models May 23, 2024 Image Generation reinforcement-learning
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement Dec 21, 2022 Audio-Visual Speech Recognition Resynthesis
ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration Jan 1, 2023 Audio-Visual Speech Recognition Resynthesis
Revival with Voice: Multi-modal Controllable Text-to-Speech Synthesis May 25, 2025 Speech Synthesis text-to-speech
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS Jun 30, 2022 Decoder GPU
Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization Jul 2, 2024 Inference Optimization Speech Synthesis
RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus May 1, 2014 Speech Synthesis text-to-speech
Russian Stress Prediction using Maximum Entropy Ranking Oct 1, 2013 Machine Translation Prediction
Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling May 26, 2025 Sentence Speech Synthesis
Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model Apr 24, 2023 Rhythm Self-Supervised Learning
S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamless Speech-Text Alignment and Streaming Speech Generation Jun 11, 2025 Reading Comprehension Speech Synthesis
SALF-MOS: Speaker Agnostic Latent Features Downsampled for MOS Prediction Jun 2, 2025 Speech Synthesis text-to-speech
SALTTS: Leveraging Self-Supervised Speech Representations for improved Text-to-Speech Synthesis Aug 2, 2023 Decoder Self-Supervised Learning
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input Feb 19, 2021 Language Modeling Language Modelling
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis Dec 6, 2023 Speech Synthesis text-to-speech
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation May 16, 2020 Decoder Speech Synthesis
Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain Jun 3, 2019 Automatic Speech Recognition Automatic Speech Recognition (ASR)
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models May 23, 2023 Speech Synthesis text-to-speech
Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis May 18, 2025 Speech Synthesis text-to-speech
Simultaneous Speech-to-Speech Translation System with Neural Incremental ASR, MT, and TTS Nov 10, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs Jul 18, 2023 Generative Adversarial Network Language Modeling
SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis Apr 6, 2022 Speech Synthesis text-to-speech
Speaker-independent raw waveform model for glottal excitation Apr 25, 2018 model Speech Synthesis
Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis Jun 3, 2021 Data Augmentation Speaker Verification
Aligning Opinions: Cross-Lingual Opinion Mining with Dependencies Jul 1, 2015 Coreference Resolution Named Entity Recognition (NER)
Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention Oct 29, 2018 Speech Synthesis text-to-speech
Speech Bandwidth Expansion Via High Fidelity Generative Adversarial Networks Jul 26, 2024 Generative Adversarial Network Speech Enhancement
Speech denoising by parametric resynthesis Apr 2, 2019 Denoising Resynthesis
WebWOZ: A Platform for Designing and Conducting Web-based Wizard of Oz Experiments Aug 1, 2013 Machine Translation Speech Recognition
Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models Jul 18, 2024 Language Modeling Language Modelling
A distributed cloud-based dialog system for conversational application development Sep 1, 2015 Speech Recognition Speech Synthesis
Stable-TTS: Stable Speaker-Adaptive Text-to-Speech Synthesis via Prosody Prompting Dec 28, 2024 Speech Synthesis text-to-speech
Adaptive Parser-Centric Text Normalization Aug 1, 2013 Machine Translation Speech Recognition
StyleFusion TTS: Multimodal Style-control and Enhanced Feature Fusion for Zero-shot Text-to-speech Synthesis Sep 24, 2024 Speech Synthesis text-to-speech
Style Mixture of Experts for Expressive Text-To-Speech Synthesis Jun 5, 2024 Mixture-of-Experts Speech Synthesis