| Conditional diffusion model with spatial attention and latent embedding for medical image segmentation | Feb 10, 2025 | Hippocampus, Image Segmentation | Code Available | 1 |
| FastCar: Cache Attentive Replay for Fast Auto-Regressive Video Generation on the Edge | May 17, 2025 | Image Generation, Scheduling | Code Available | 1 |
| Patch-based Object-centric Transformers for Efficient Video Generation | Jun 8, 2022 | Object, Video Editing | Code Available | 1 |
| Playable Video Generation | Jan 28, 2021 | Decoder, Video Generation | Code Available | 1 |
| SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers | Dec 20, 2022 | Decoder, Denoising | Code Available | 1 |
| StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation | Aug 31, 2023 | Style Transfer, Unconditional Video Generation | Code Available | 1 |
| OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models | Nov 15, 2024 | Optical Flow Estimation, Text-to-Video Generation | Code Available | 1 |
| OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation | May 10, 2024 | 3D Reconstruction, Image to 3D | Code Available | 1 |
| OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models | May 21, 2024 | Video Generation | Code Available | 1 |
| NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | Nov 24, 2021 | Decoder, Image Generation | Code Available | 1 |
| AIGVE-Tool: AI-Generated Video Evaluation Toolkit with Multifaceted Benchmark | Mar 18, 2025 | Video Generation | Code Available | 1 |
| Object-Centric Image to Video Generation with Language Guidance | Feb 17, 2025 | Image to Video Generation, Object | Code Available | 1 |
| EG4D: Explicit Generation of 4D Object without Score Distillation | May 28, 2024 | Dynamic Reconstruction, Video Generation | Code Available | 1 |
| Non-linear Motion Estimation for Video Frame Interpolation using Space-time Convolutions | Jan 27, 2022 | Motion Estimation, Video Frame Interpolation | Code Available | 1 |
| Temporal Shift GAN for Large Scale Video Generation | Apr 4, 2020 | Video Generation | Code Available | 1 |
| AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM | Nov 26, 2024 | Benchmarking, Text-to-Video Generation | Code Available | 1 |
| Multi-StyleGAN: Towards Image-Based Simulation of Time-Lapse Live-Cell Microscopy | Jun 15, 2021 | Descriptive, Generative Adversarial Network | Code Available | 1 |
| MVOC: a training-free multiple video object composition method with diffusion models | Jun 22, 2024 | Image to Video Generation, Object | Code Available | 1 |
| EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models | Mar 25, 2025 | Video Generation | Code Available | 1 |
| Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices | Oct 15, 2024 | Image Generation, multimodal generation | Code Available | 1 |
| Editable Free-viewpoint Video Using a Layered Neural Representation | Apr 30, 2021 | Disentanglement, NeRF | Code Available | 1 |
| Compositional Video Synthesis with Action Graphs | Jun 27, 2020 | Scheduling, Video Generation | Code Available | 1 |
| EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion | Jan 23, 2025 | Video Generation | Code Available | 1 |
| AICL: Action In-Context Learning for Video Diffusion Model | Mar 18, 2024 | Action Generation, In-Context Learning | Code Available | 1 |
| ECHOPulse: ECG controlled echocardio-grams video generation | Oct 4, 2024 | Video Generation | Code Available | 1 |
| EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing | Jun 2, 2024 | De-identification, Privacy Preserving | Code Available | 1 |
| MOSO: Decomposing MOtion, Scene and Object for Video Prediction | Mar 7, 2023 | Object, Unconditional Video Generation | Code Available | 1 |
| A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation | Sep 26, 2024 | Inductive Bias, Video Generation | Code Available | 1 |
| MoStGAN-V: Video Generation with Temporal Motion Styles | Apr 5, 2023 | Video Generation | Code Available | 1 |
| MotionCrafter: One-Shot Motion Customization of Diffusion Models | Dec 8, 2023 | Disentanglement, Motion Disentanglement | Code Available | 1 |
| Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions | Jan 3, 2024 | Image Animation, Video Editing | Code Available | 1 |
| E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning | Jan 16, 2024 | Video Generation | Code Available | 1 |
| MotionCraft: Physics-based Zero-Shot Video Generation | May 22, 2024 | Image Generation, Missing Elements | Code Available | 1 |
| Click to Move: Controlling Video Generation with Sparse Motion | Aug 19, 2021 | Video Generation | Code Available | 1 |
| MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation | Oct 2, 2024 | Video Generation | Code Available | 1 |
| ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance | May 27, 2024 | Diffusion Personalization, Video Generation | Code Available | 1 |
| MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation | May 29, 2025 | Motion Generation, Video Generation | Code Available | 1 |
| MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance | Jun 28, 2024 | Image Generation, Video Generation | Code Available | 1 |
| CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation | May 21, 2025 | Video Generation | Code Available | 1 |
| Minute-Long Videos with Dual Parallelisms | May 27, 2025 | Denoising, GPU | Code Available | 1 |
| DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation | Apr 9, 2025 | Image Generation, Text to Image Generation | Code Available | 1 |
| DwNet: Dense warp-based network for pose-guided human video generation | Oct 21, 2019 | Video Generation | Code Available | 1 |
| A Good Image Generator Is What You Need for High-Resolution Video Synthesis | Apr 30, 2021 | Video Generation | Code Available | 1 |
| A Light and Tuning-free Method for Simulating Camera Motion in Video Generation | Mar 9, 2025 | Denoising, Depth Estimation | Code Available | 1 |
| DVD-Quant: Data-free Video Diffusion Transformers Quantization | May 24, 2025 | Data Free Quantization, Quantization | Code Available | 1 |
| MAVIN: Multi-Action Video Generation with Diffusion Models via Transition Video Infilling | May 28, 2024 | Video Generation | Code Available | 1 |
| Mask-conditioned latent diffusion for generating gastrointestinal polyp images | Apr 11, 2023 | Image Generation, Image Segmentation | Code Available | 1 |
| MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation | Aug 1, 2020 | Face Generation, Talking Face Generation | Code Available | 1 |
| DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance | Mar 5, 2025 | 3D Object Detection, BEV Segmentation | Code Available | 1 |
| Make-A-Video: Text-to-Video Generation without Text-Video Data | Sep 29, 2022 | Decoder, Image Generation | Code Available | 1 |