| Supervising 3D Talking Head Avatars with Analysis-by-Audio-Synthesis | Apr 18, 2025 | Audio Synthesis | —Unverified | 0 |
| TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis | Apr 8, 2025 | Audio Synthesis, FAD | —Unverified | 0 |
| Designing Neural Synthesizers for Low-Latency Interaction | Mar 14, 2025 | Audio Synthesis | —Unverified | 0 |
| Long-Video Audio Synthesis with Multi-Agent Collaboration | Mar 13, 2025 | Audio Synthesis, Scene Segmentation | —Unverified | 0 |
| Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | Feb 26, 2025 | Audio Synthesis, Automatic Speech Recognition | —Unverified | 0 |
| XAttnMark: Learning Robust Audio Watermarking with Cross-Attention | Feb 6, 2025 | Audio Synthesis, Face Swapping | —Unverified | 0 |
| Customized Condition Controllable Generation for Video Soundtrack | Jan 1, 2025 | Audio Synthesis | —Unverified | 0 |
| Tri-Ergon: Fine-grained Video-to-Audio Generation with Multi-modal Conditions and LUFS Control | Dec 29, 2024 | Audio Generation, Audio Synthesis | —Unverified | 0 |
| CSSinger: End-to-End Chunkwise Streaming Singing Voice Synthesis System Based on Conditional Variational Autoencoder | Dec 12, 2024 | Audio Synthesis, Singing Voice Synthesis | —Unverified | 0 |
| Zero-Shot Mono-to-Binaural Speech Synthesis | Dec 11, 2024 | Audio Synthesis, Denoising | —Unverified | 0 |