| FlexDiT: Dynamic Token Density Control for Diffusion Transformer | Dec 8, 2024 | Computational Efficiency, Denoising | Code Available | 1 |
| Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model | Nov 28, 2024 | Denoising, Video Generation | Code Available | 1 |
| AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM | Nov 26, 2024 | Benchmarking, Text-to-Video Generation | Code Available | 1 |
| InTraGen: Trajectory-controlled Video Generation for Object Interactions | Nov 25, 2024 | Object, Video Generation | Code Available | 1 |
| StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart | Nov 21, 2024 | Video Generation | Code Available | 1 |
| PoM: Efficient Image and Video Generation with the Polynomial Mixer | Nov 19, 2024 | Video Generation | Code Available | 1 |
| OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models | Nov 15, 2024 | Optical Flow Estimation, Text-to-Video Generation | Code Available | 1 |
| Fast and Memory-Efficient Video Diffusion Using Streamlined Inference | Nov 2, 2024 | GPU, Video Generation | Code Available | 1 |
| Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning | Oct 31, 2024 | Motion Synthesis, Text-to-Video Generation | Code Available | 1 |
| Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices | Oct 15, 2024 | Image Generation, multimodal generation | Code Available | 1 |
| SeeClear: Semantic Distillation Enhances Pixel Condensation for Video Super-Resolution | Oct 8, 2024 | Super-Resolution, Video Generation | Code Available | 1 |
| Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality | Oct 7, 2024 | Video Generation | Code Available | 1 |
| Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach | Oct 4, 2024 | Image Generation, Image to Video Generation | Code Available | 1 |
| ECHOPulse: ECG controlled echocardio-grams video generation | Oct 4, 2024 | Video Generation | Code Available | 1 |
| MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation | Oct 2, 2024 | Video Generation | Code Available | 1 |
| Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining | Oct 1, 2024 | Atari Games, model | Code Available | 1 |
| A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation | Sep 26, 2024 | Inductive Bias, Video Generation | Code Available | 1 |
| Infrared Small Target Detection in Satellite Videos: A New Dataset and A Novel Recurrent Feature Refinement Framework | Sep 19, 2024 | Motion Compensation, Video Generation | Code Available | 1 |
| AMG: Avatar Motion Guided Video Generation | Sep 2, 2024 | Video Generation | Code Available | 1 |
| MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance | Jun 28, 2024 | Image Generation, Video Generation | Code Available | 1 |
| MVOC: a training-free multiple video object composition method with diffusion models | Jun 22, 2024 | Image to Video Generation, Object | Code Available | 1 |
| SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset | Jun 20, 2024 | Safety Alignment, Text-to-Video Generation | Code Available | 1 |
| Fantastic Copyrighted Beasts and How (Not) to Generate Them | Jun 20, 2024 | Image Generation, Video Generation | Code Available | 1 |
| VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs | Jun 14, 2024 | Anomaly Detection, Benchmarking | Code Available | 1 |
| TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation | Jun 12, 2024 | Benchmarking, Image Generation | Code Available | 1 |
| Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion | Jun 9, 2024 | Autonomous Driving, Object | Code Available | 1 |
| ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation | Jun 4, 2024 | Quantization, Video Generation | Code Available | 1 |
| EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing | Jun 2, 2024 | De-identification, Privacy Preserving | Code Available | 1 |
| EG4D: Explicit Generation of 4D Object without Score Distillation | May 28, 2024 | Dynamic Reconstruction, Video Generation | Code Available | 1 |
| MAVIN: Multi-Action Video Generation with Diffusion Models via Transition Video Infilling | May 28, 2024 | Video Generation | Code Available | 1 |
| ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance | May 27, 2024 | Diffusion Personalization, Video Generation | Code Available | 1 |
| MotionCraft: Physics-based Zero-Shot Video Generation | May 22, 2024 | Image Generation, Missing Elements | Code Available | 1 |
| OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models | May 21, 2024 | Video Generation | Code Available | 1 |
| OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation | May 10, 2024 | 3D Reconstruction, Image to 3D | Code Available | 1 |
| TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation | May 7, 2024 | Text-to-Video Generation, Video Generation | Code Available | 1 |
| FlexiFilm: Long Video Generation with Flexible Conditions | Apr 29, 2024 | Image Generation, Video Generation | Code Available | 1 |
| TAVGBench: Benchmarking Text to Audible-Video Generation | Apr 22, 2024 | Benchmarking, Contrastive Learning | Code Available | 1 |
| StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN | Mar 21, 2024 | Unconditional Video Generation, Video Generation | Code Available | 1 |
| VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis | Mar 20, 2024 | Generative Temporal Nursing, Text-to-Video Generation | Code Available | 1 |
| AICL: Action In-Context Learning for Video Diffusion Model | Mar 18, 2024 | Action Generation, In-Context Learning | Code Available | 1 |
| SSM Meets Video Diffusion Models: Efficient Long-Term Video Generation with Structured State Spaces | Mar 12, 2024 | GPU, Image Generation | Code Available | 1 |
| Pix2Gif: Motion-Guided Diffusion for GIF Generation | Mar 7, 2024 | Video Generation | Code Available | 1 |
| Constrained Synthesis with Projected Diffusion Models | Feb 5, 2024 | Motion Synthesis, Video Generation | Code Available | 1 |
| Detecting AI-Generated Video via Frame Consistency | Feb 3, 2024 | Video Generation | Code Available | 1 |
| DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Representations | Jan 23, 2024 | 3D Shape Generation, Image Generation | Code Available | 1 |
| Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion | Jan 19, 2024 | 3D Generation, Neural Rendering | Code Available | 1 |
| Vlogger: Make Your Dream A Vlog | Jan 17, 2024 | Language Modelling, Large Language Model | Code Available | 1 |
| E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning | Jan 16, 2024 | Video Generation | Code Available | 1 |
| Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions | Jan 3, 2024 | Image Animation, Video Editing | Code Available | 1 |
| VideoStudio: Generating Consistent-Content and Multi-Scene Videos | Jan 2, 2024 | Descriptive, Video Generation | Code Available | 1 |