| Addressing Emotion Bias in Music Emotion Recognition and Generation with Frechet Audio Distance | Sep 23, 2024 | Emotion Recognition, FAD | Code Available | 3 | 5 |
| MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation | Dec 19, 2022 | cross-modal alignment, Denoising | Code Available | 2 | 5 |
| L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection | Aug 7, 2024 | 3D Object Detection, Autonomous Navigation | Code Available | 2 | 5 |
| Adapting Frechet Audio Distance for Generative Music Evaluation | Nov 2, 2023 | FAD | Code Available | 2 | 5 |
| MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models | Jun 7, 2024 | FAD, Text-to-Music Generation | Code Available | 2 | 5 |
| Taming Data and Transformers for Audio Generation | Jun 27, 2024 | Audio captioning, Audio Generation | Code Available | 2 | 5 |
| FlowDec: A flow-based full-band general audio codec with high perceptual quality | Mar 3, 2025 | FAD | Code Available | 2 | 5 |
| Efficient Autoregressive Audio Modeling via Next-Scale Prediction | Aug 16, 2024 | Audio Generation, FAD | Code Available | 2 | 5 |
| KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation | Feb 21, 2025 | Audio Generation, FAD | Code Available | 2 | 5 |
| Frechet Music Distance: A Metric For Generative Symbolic Music Evaluation | Dec 10, 2024 | FAD, Music Generation | Code Available | 1 | 5 |
| DOSE : Drum One-Shot Extraction from Music Mixture | Apr 25, 2025 | FAD | Code Available | 1 | 5 |
| Timbre Transfer with Variational Auto Encoding and Cycle-Consistent Adversarial Networks | Sep 5, 2021 | 8k, FAD | Code Available | 1 | 5 |
| Multi-Source Music Generation with Latent Diffusion | Sep 10, 2024 | FAD, Music Generation | Code Available | 1 | 5 |
| BemaGANv2: A Tutorial and Comparative Survey of GAN-based Vocoders for Long-Term Audio Generation | Jun 11, 2025 | Audio Generation, FAD | Code Available | 1 | 5 |
| AMSP-UOD: When Vortex Convolution and Stochastic Perturbation Meet Underwater Object Detection | Aug 23, 2023 | FAD, Object | Code Available | 1 | 5 |
| Enhance Generation Quality of Flow Matching V2A Model via Multi-Step CoT-Like Guidance and Combined Preference Optimization | Mar 28, 2025 | Audio Generation, FAD | Code Available | 1 | 5 |
| Representation Sharing for Fast Object Detector Search and Beyond | Jul 23, 2020 | FAD, GPU | Code Available | 1 | 5 |
| Aligning Text-to-Music Evaluation with Human Preferences | Mar 20, 2025 | FAD | Code Available | 1 | 5 |
| Twitch Plays Pokemon, Machine Learns Twitch: Unsupervised Context-Aware Anomaly Detection for Identifying Trolls in Streaming Data | Feb 17, 2019 | Anomaly Detection, Clustering | Code Available | 0 | 5 |
| Adapting Offline Speech Translation Models for Streaming with Future-Aware Distillation and Inference | Mar 14, 2023 | FAD, Translation | Code Available | 0 | 5 |
| Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning | Nov 28, 2022 | FAD, Video Captioning | Code Available | 0 | 5 |
| AnoPLe: Few-Shot Anomaly Detection via Bi-directional Prompt Learning with Only Normal Samples | Aug 24, 2024 | Anomaly Detection, Decoder | Code Available | 0 | 5 |
| Reverse the auditory processing pathway: Coarse-to-fine audio reconstruction from fMRI | May 29, 2024 | FAD | Code Available | 0 | 5 |
| Generating Diverse Vocal Bursts with StyleGAN2 and MEL-Spectrograms | Jun 25, 2022 | FAD | Code Available | 0 | 5 |
| CLOTH4D: A Dataset for Clothed Human Reconstruction | Jan 1, 2023 | FAD | Code Available | 0 | 5 |
| Latent CLAP Loss for Better Foley Sound Synthesis | Mar 18, 2024 | FAD | Code Available | 0 | 5 |
| Phase asymmetry guided adaptive fractional-order total variation and diffusion for feature-preserving ultrasound despeckling | Oct 30, 2018 | FAD | Unverified | 0 | 0 |
| Predicting Personal Traits from Facial Images using Convolutional Neural Networks Augmented with Facial Landmark Information | May 29, 2016 | Attribute, FAD | Unverified | 0 | 0 |
| Quantum Machine Learning: Fad or Future? | Jun 20, 2021 | BIG-bench Machine Learning, FAD | Unverified | 0 | 0 |
| RenderBox: Expressive Performance Rendering with Text Control | Feb 11, 2025 | Diversity, FAD | Unverified | 0 | 0 |
| Responding to Illegal Activities Along the Canadian Coastlines Using Reinforcement Learning | Aug 5, 2021 | FAD, reinforcement-learning | Unverified | 0 | 0 |
| Retrieval-Augmented Text-to-Audio Generation | Sep 14, 2023 | AudioCaps, Audio Generation | Unverified | 0 | 0 |
| Sensing Performance of Multi-Channel RFID-based Finger Augmentation Devices for Tactile Internet | May 24, 2022 | FAD | Unverified | 0 | 0 |
| Sound Scene Synthesis at the DCASE 2024 Challenge | Jan 15, 2025 | FAD | Unverified | 0 | 0 |
| TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis | Apr 8, 2025 | Audio Synthesis, FAD | Unverified | 0 | 0 |
| Tuna-AI: tuna biomass estimation with Machine Learning models trained on oceanography and echosounder FAD data | Sep 14, 2021 | FAD | Unverified | 0 | 0 |
| Market Making with Fads, Informed, and Uninformed Traders | Jan 7, 2025 | FAD | Unverified | 0 | 0 |
| A Fast Automatic Method for Deconvoluting Macro X-ray Fluorescence Data Collected from Easel Paintings | Oct 31, 2022 | FAD | Unverified | 0 | 0 |
| A General Framework for Learning Procedural Audio Models of Environmental Sounds | Mar 4, 2023 | FAD | Unverified | 0 | 0 |
| A Study on Robustness to Perturbations for Representations of Environmental Sound | Mar 20, 2022 | FAD, Transfer Learning | Unverified | 0 | 0 |
| Audiobox: Unified Audio Generation with Natural Language Prompts | Dec 25, 2023 | AudioCaps, Audio Generation | Unverified | 0 | 0 |
| Braille-to-Speech Generator: Audio Generation Based on Joint Fine-Tuning of CLIP and Fastspeech2 | Jul 19, 2024 | Audio Generation, Audio Synthesis | Unverified | 0 | 0 |
| Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings | Sep 12, 2024 | FAD, Image Captioning | Unverified | 0 | 0 |
| Detecting immune cells with label-free two-photon autofluorescence and deep learning | Jun 17, 2025 | Binary Classification, Classification | Unverified | 0 | 0 |
| Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning | Jan 24, 2025 | FAD, Language Modeling | Unverified | 0 | 0 |
| DRAGON: Distributional Rewards Optimize Diffusion Generative Models | Apr 21, 2025 | FAD | Unverified | 0 | 0 |
| Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion | Apr 29, 2025 | Action Generation, FAD | Unverified | 0 | 0 |
| Enhancing U.S. swine farm preparedness for infectious foreign animal diseases with rapid access to biosecurity information | Apr 12, 2025 | FAD | Unverified | 0 | 0 |
| Exploring compressibility of transformer based text-to-music (TTM) models | Jun 24, 2024 | Decoder, FAD | Unverified | 0 | 0 |
| FaceCat: Enhancing Face Recognition Security with a Unified Diffusion Model | Apr 14, 2024 | Face Anti-Spoofing, Face Recognition | Unverified | 0 | 0 |