| StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation | Aug 31, 2023 | Style Transfer, Unconditional Video Generation | Code Available | 1 | 5 |
| MAVIN: Multi-Action Video Generation with Diffusion Models via Transition Video Infilling | May 28, 2024 | Video Generation | Code Available | 1 | 5 |
| Mask-conditioned latent diffusion for generating gastrointestinal polyp images | Apr 11, 2023 | Image Generation, Image Segmentation | Code Available | 1 | 5 |
| MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation | Aug 1, 2020 | Face Generation, Talking Face Generation | Code Available | 1 | 5 |
| AIGVE-Tool: AI-Generated Video Evaluation Toolkit with Multifaceted Benchmark | Mar 18, 2025 | Video Generation | Code Available | 1 | 5 |
| Make It Move: Controllable Image-to-Video Generation with Text Descriptions | Dec 6, 2021 | Diversity, Image to Video Generation | Code Available | 1 | 5 |
| SLAMP: Stochastic Latent Appearance and Motion Prediction | Aug 5, 2021 | Autonomous Driving, motion prediction | Code Available | 1 | 5 |
| MagicStick: Controllable Video Editing via Control Handle Transformations | Dec 5, 2023 | Video Editing, Video Generation | Code Available | 1 | 5 |
| Compositional Video Synthesis with Action Graphs | Jun 27, 2020 | Scheduling, Video Generation | Code Available | 1 | 5 |
| EG4D: Explicit Generation of 4D Object without Score Distillation | May 28, 2024 | Dynamic Reconstruction, Video Generation | Code Available | 1 | 5 |
| AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM | Nov 26, 2024 | Benchmarking, Text-to-Video Generation | Code Available | 1 | 5 |
| SinFusion: Training Diffusion Models on a Single Image or Video | Nov 21, 2022 | Diversity, Image Manipulation | Code Available | 1 | 5 |
| Sketching the Future (STF): Applying Conditional Control Techniques to Text-to-Video Models | May 10, 2023 | Text-to-Video Generation, Video Generation | Code Available | 1 | 5 |
| Sliced Wasserstein Generative Models | Jun 8, 2017 | Image Generation, Video Generation | Code Available | 1 | 5 |
| EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models | Mar 25, 2025 | Video Generation | Code Available | 1 | 5 |
| Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices | Oct 15, 2024 | Image Generation, multimodal generation | Code Available | 1 | 5 |
| Editable Free-viewpoint Video Using a Layered Neural Representation | Apr 30, 2021 | Disentanglement, NeRF | Code Available | 1 | 5 |
| EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion | Jan 23, 2025 | Video Generation | Code Available | 1 | 5 |
| AICL: Action In-Context Learning for Video Diffusion Model | Mar 18, 2024 | Action Generation, In-Context Learning | Code Available | 1 | 5 |
| ECHOPulse: ECG controlled echocardio-grams video generation | Oct 4, 2024 | Video Generation | Code Available | 1 | 5 |
| EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing | Jun 2, 2024 | De-identification, Privacy Preserving | Code Available | 1 | 5 |
| Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning | Mar 4, 2022 | Self-Learning, Text Augmentation | Code Available | 1 | 5 |
| A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation | Sep 26, 2024 | Inductive Bias, Video Generation | Code Available | 1 | 5 |
| LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation | May 17, 2025 | Benchmarking, Question Answering | Code Available | 1 | 5 |
| E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning | Jan 16, 2024 | Video Generation | Code Available | 1 | 5 |
| MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving | Mar 20, 2025 | Autonomous Driving, Denoising | Code Available | 1 | 5 |
| Make-A-Video: Text-to-Video Generation without Text-Video Data | Sep 29, 2022 | Decoder, Image Generation | Code Available | 1 | 5 |
| Sliced Wasserstein Generative Models | Apr 10, 2019 | Image Generation, Video Generation | Code Available | 1 | 5 |
| Click to Move: Controlling Video Generation with Sparse Motion | Aug 19, 2021 | Video Generation | Code Available | 1 | 5 |
| LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models | Sep 26, 2023 | Super-Resolution, Text-to-Video Generation | Code Available | 1 | 5 |
| Latent Video Transformer | Jun 18, 2020 | Video Generation, Video Prediction | Code Available | 1 | 5 |
| ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance | May 27, 2024 | Diffusion Personalization, Video Generation | Code Available | 1 | 5 |
| Latent Image Animator: Learning to animate image via latent space navigation | Sep 29, 2021 | Image Animation, Video Generation | Code Available | 1 | 5 |
| CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation | May 21, 2025 | Video Generation | Code Available | 1 | 5 |
| Latent Neural Differential Equations for Video Generation | Nov 7, 2020 | Unconditional Video Generation, Video Generation | Code Available | 1 | 5 |
| DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation | Apr 9, 2025 | Image Generation, Text to Image Generation | Code Available | 1 | 5 |
| DwNet: Dense warp-based network for pose-guided human video generation | Oct 21, 2019 | Video Generation | Code Available | 1 | 5 |
| A Good Image Generator Is What You Need for High-Resolution Video Synthesis | Apr 30, 2021 | Video Generation | Code Available | 1 | 5 |
| DVD-Quant: Data-free Video Diffusion Transformers Quantization | May 24, 2025 | Data Free Quantization, Quantization | Code Available | 1 | 5 |
| DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation | May 23, 2023 | Text-to-Video Generation, Video Generation | Code Available | 1 | 5 |
| DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance | Mar 5, 2025 | 3D Object Detection, BEV Segmentation | Code Available | 1 | 5 |
| InTraGen: Trajectory-controlled Video Generation for Object Interactions | Nov 25, 2024 | Object, Video Generation | Code Available | 1 | 5 |
| DTVNet: Dynamic Time-lapse Video Generation via Single Still Image | Aug 11, 2020 | Decoder, Optical Flow Estimation | Code Available | 1 | 5 |
| A Light and Tuning-free Method for Simulating Camera Motion in Video Generation | Mar 9, 2025 | Denoising, Depth Estimation | Code Available | 1 | 5 |
| SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers | Dec 20, 2022 | Decoder, Denoising | Code Available | 1 | 5 |
| DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation | Mar 8, 2025 | Video Generation | Code Available | 1 | 5 |
| Scaling Autoregressive Video Models | Jun 6, 2019 | Action Recognition, Video Generation | Code Available | 1 | 5 |
| Infrared Small Target Detection in Satellite Videos: A New Dataset and A Novel Recurrent Feature Refinement Framework | Sep 19, 2024 | Motion Compensation, Video Generation | Code Available | 1 | 5 |
| Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining | Oct 1, 2024 | Atari Games, model | Code Available | 1 | 5 |
| Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search | Jan 31, 2025 | Denoising, Video Alignment | Code Available | 1 | 5 |