| TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis | Apr 8, 2025 | Audio Synthesis, FAD | Unverified | 0 |
| RenderBox: Expressive Performance Rendering with Text Control | Feb 11, 2025 | Diversity, FAD | Unverified | 0 |
| Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning | Jan 24, 2025 | FAD, Language Modeling | Unverified | 0 |
| Sound Scene Synthesis at the DCASE 2024 Challenge | Jan 15, 2025 | FAD | Unverified | 0 |
| Market Making with Fads, Informed, and Uninformed Traders | Jan 7, 2025 | FAD | Unverified | 0 |
| MIMII-Gen: Generative Modeling Approach for Simulated Evaluation of Anomalous Sound Detection System | Sep 27, 2024 | Anomaly Detection, FAD | Unverified | 0 |
| Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings | Sep 12, 2024 | FAD, Image Captioning | Unverified | 0 |
| Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer | Sep 9, 2024 | FAD | Unverified | 0 |
| AnoPLe: Few-Shot Anomaly Detection via Bi-directional Prompt Learning with Only Normal Samples | Aug 24, 2024 | Anomaly Detection, Decoder | Code Available | 0 |
| Braille-to-Speech Generator: Audio Generation Based on Joint Fine-Tuning of CLIP and Fastspeech2 | Jul 19, 2024 | Audio Generation, Audio Synthesis | Unverified | 0 |
| Exploring compressibility of transformer based text-to-music (TTM) models | Jun 24, 2024 | Decoder, FAD | Unverified | 0 |
| Reverse the auditory processing pathway: Coarse-to-fine audio reconstruction from fMRI | May 29, 2024 | FAD | Code Available | 0 |
| FAD-SAR: A Novel Fishing Activity Detection System via Synthetic Aperture Radar Images Based on Deep Learning Method | Apr 28, 2024 | Action Detection, Activity Detection | Unverified | 0 |
| FaceCat: Enhancing Face Recognition Security with a Unified Diffusion Model | Apr 14, 2024 | Face Anti-Spoofing, Face Recognition | Unverified | 0 |
| Latent CLAP Loss for Better Foley Sound Synthesis | Mar 18, 2024 | FAD | Code Available | 0 |
| MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction | Jan 24, 2024 | FAD | Unverified | 0 |
| Audiobox: Unified Audio Generation with Natural Language Prompts | Dec 25, 2023 | AudioCaps, Audio Generation | Unverified | 0 |
| Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis | Sep 21, 2023 | FAD, Information Retrieval | Unverified | 0 |
| Retrieval-Augmented Text-to-Audio Generation | Sep 14, 2023 | AudioCaps, Audio Generation | Unverified | 0 |
| Flatness-Aware Minimization for Domain Generalization | Jul 20, 2023 | Domain Generalization, FAD | Unverified | 0 |
| Feature Adversarial Distillation for Point Cloud Classification | Jun 25, 2023 | Classification, FAD | Unverified | 0 |
| Adapting Offline Speech Translation Models for Streaming with Future-Aware Distillation and Inference | Mar 14, 2023 | FAD, Translation | Code Available | 0 |
| A General Framework for Learning Procedural Audio Models of Environmental Sounds | Mar 4, 2023 | FAD | Unverified | 0 |
| Federated Automatic Differentiation | Jan 18, 2023 | FAD, Federated Learning | Unverified | 0 |
| CLOTH4D: A Dataset for Clothed Human Reconstruction | Jan 1, 2023 | FAD | Code Available | 0 |