| Singing Synthesis: with a little help from my attention | Dec 12, 2019 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| SlimSpeech: Lightweight and Efficient Text-to-Speech with Slim Rectified Flow | Apr 10, 2025 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs | Jul 18, 2023 | Generative Adversarial Network, Language Modeling | —Unverified | 0 | 0 |
| Smart Summarizer for Blind People | Jan 1, 2020 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech | Nov 30, 2022 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| SNIPER Training: Single-Shot Sparse Training for Text-to-Speech | Nov 14, 2022 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| SoK: A Study of the Security on Voice Processing Systems | Dec 24, 2021 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis | Apr 6, 2022 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| Source Tracing of Audio Deepfake Systems | Jul 10, 2024 | Face Swapping, text-to-speech | —Unverified | 0 | 0 |
| MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis | Feb 26, 2025 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| SpeakEasy: Enhancing Text-to-Speech Interactions for Expressive Content Creation | Apr 7, 2025 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Speaker-adaptive neural vocoders for parametric speech synthesis systems | Nov 8, 2018 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| Speaker Generation | Nov 7, 2021 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Speaker-independent raw waveform model for glottal excitation | Apr 25, 2018 | model, Speech Synthesis | —Unverified | 0 | 0 |
| Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis | Jun 3, 2021 | Data Augmentation, Speaker Verification | —Unverified | 0 | 0 |
| Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention | Oct 29, 2018 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| SpeakStream: Streaming Text-to-Speech with Interleaved Data | May 25, 2025 | Decoder, text-to-speech | —Unverified | 0 | 0 |
| Speak While You Think: Streaming Speech Synthesis During Text Generation | Sep 20, 2023 | Speech Synthesis, Text Generation | —Unverified | 0 | 0 |
| Spectral Codecs: Improving Non-Autoregressive Speech Synthesis with Spectrogram-Based Audio Codecs | Jun 7, 2024 | Quantization, Speech Synthesis | —Unverified | 0 | 0 |
| Speculative End-Turn Detector for Efficient Speech Chatbot Assistant | Mar 30, 2025 | Chatbot, Collaborative Inference | —Unverified | 0 | 0 |
| Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction | May 8, 2013 | Speech Synthesis, Speech-to-Text | —Unverified | 0 | 0 |
| Speech Aware Dialog System Technology Challenge (DSTC11) | Dec 16, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Speech Bandwidth Expansion Via High Fidelity Generative Adversarial Networks | Jul 26, 2024 | Generative Adversarial Network, Speech Enhancement | —Unverified | 0 | 0 |
| Speech BERT Embedding For Improving Prosody in Neural TTS | Jun 8, 2021 | Decoder, text-to-speech | —Unverified | 0 | 0 |
| Speech denoising by parametric resynthesis | Apr 2, 2019 | Denoising, Resynthesis | —Unverified | 0 | 0 |
| Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody? | Oct 31, 2024 | Rhythm, speech-recognition | —Unverified | 0 | 0 |
| Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis | Jul 8, 2025 | Data Augmentation, Mixture-of-Experts | —Unverified | 0 | 0 |
| Speech Synthesis along Perceptual Voice Quality Dimensions | Jan 15, 2025 | Expressive Speech Synthesis, Speech Synthesis | —Unverified | 0 | 0 |
| Speech Synthesis for Low Resource Languages using Transliteration Enabled Transfer Learning | Nov 16, 2021 | speech-recognition, Speech Recognition | —Unverified | 0 | 0 |
| Speech Synthesis of Code-Mixed Text | May 1, 2016 | Language Identification, Speech Synthesis | —Unverified | 0 | 0 |
| Speech Synthesis with Mixed Emotions | Aug 11, 2022 | Attribute, Emotional Speech Synthesis | —Unverified | 0 | 0 |
| Speech Token Prediction via Compressed-to-fine Language Modeling for Speech Generation | May 30, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Speech to Speech Translation with Translatotron: A State of the Art Review | Feb 9, 2025 | speech-recognition, Speech Recognition | —Unverified | 0 | 0 |
| Speech to text and text to speech recognition systems - A review | Mar 17, 2018 | speech-recognition, Speech Recognition | —Unverified | 0 | 0 |
| Speech-T: Transducer for Text to Speech and Beyond | Dec 1, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Speech vocoding for laboratory phonology | Jan 22, 2016 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| SpeechX: Neural Codec Language Model as a Versatile Speech Transformer | Aug 14, 2023 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| SpMis: An Investigation of Synthetic Spoken Misinformation Detection | Sep 17, 2024 | Misinformation, text-to-speech | —Unverified | 0 | 0 |
| Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models | Jul 18, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| SpoofCeleb: Speech Deepfake Detection and SASV In The Wild | Sep 18, 2024 | DeepFake Detection, Diversity | —Unverified | 0 | 0 |
| Spotlight-TTS: Spotlighting the Style via Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech | May 27, 2025 | Style Transfer, text-to-speech | —Unverified | 0 | 0 |
| SQuId: Measuring Speech Naturalness in Many Languages | Oct 12, 2022 | Diversity, text-to-speech | —Unverified | 0 | 0 |
| kNN Retrieval for Simple and Effective Zero-Shot Multi-speaker Text-to-Speech | Aug 20, 2024 | Retrieval, Self-Supervised Learning | —Unverified | 0 | 0 |
| Stable-TTS: Stable Speaker-Adaptive Text-to-Speech Synthesis via Prosody Prompting | Dec 28, 2024 | Speech Synthesis, text-to-speech | —Unverified | 0 | 0 |
| StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations | Apr 23, 2024 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Streaming Non-Autoregressive Model for Accent Conversion and Pronunciation Improvement | Jun 19, 2025 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Streaming Speaker Change Detection and Gender Classification for Transducer-Based Multi-Talker Speech Translation | Feb 4, 2025 | Change Detection, Gender Classification | —Unverified | 0 | 0 |
| StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling | Jun 14, 2025 | text-to-speech, Text to Speech | —Unverified | 0 | 0 |
| Structural Analysis of Hindi Phonetics and A Method for Extraction of Phonetically Rich Sentences from a Very Large Hindi Text Corpus | Jan 30, 2017 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Structured State Space Decoder for Speech Recognition and Synthesis | Oct 31, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |