| Control-A-Video: Controllable Text-to-Video Diffusion Models with Motion Prior and Reward Feedback Learning | May 23, 2023 | Image Generation, Optical Flow Estimation | Code Available | 2 |
| Grouping First, Attending Smartly: Training-Free Acceleration for Diffusion Transformers | May 20, 2025 | GPU, Video Generation | Code Available | 2 |
| I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models | Dec 27, 2023 | Video Generation | Code Available | 2 |
| Generative Inbetweening through Frame-wise Conditions-Driven Video Generation | Dec 16, 2024 | Video Generation | Code Available | 2 |
| Generative Diffusion Models on Graphs: Methods and Applications | Feb 6, 2023 | Denoising, Graph Generation | Code Available | 2 |
| Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising | May 29, 2023 | Denoising, Image Generation | Code Available | 2 |
| SF-V: Single Forward Video Generation Model | Jun 6, 2024 | Denoising, model | Code Available | 2 |
| Generating Long Videos of Dynamic Scenes | Jun 7, 2022 | MORPH, Video Generation | Code Available | 2 |
| SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation | Sep 10, 2024 | Video Generation | Code Available | 2 |
| Conditional Image-to-Video Generation with Latent Flow Diffusion Models | Mar 24, 2023 | Image to Video Generation, Motion Generation | Code Available | 2 |
| Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints | Nov 26, 2024 | Denoising, Image Generation | Code Available | 2 |
| HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation | Apr 30, 2025 | Depth Estimation, Scene Generation | Code Available | 2 |
| GenAI Arena: An Open Evaluation Platform for Generative Models | Jun 6, 2024 | Image Generation, Instruction Following | Code Available | 2 |
| SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction | Oct 31, 2023 | Prediction, Semantic Similarity | Code Available | 2 |
| RealCam-Vid: High-resolution Video Dataset with Dynamic Scenes and Metric-scale Camera Movements | Apr 11, 2025 | Video Generation | Code Available | 2 |
| Concat-ID: Towards Universal Identity-Preserving Video Synthesis | Mar 18, 2025 | Human-Domain Subject-to-Video, Video Generation | Code Available | 2 |
| Fréchet Video Motion Distance: A Metric for Evaluating Motion Consistency in Videos | Jul 23, 2024 | Image Generation, Point Tracking | Code Available | 2 |
| I^2-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting | Jul 12, 2025 | Autonomous Driving, Computational Efficiency | Code Available | 2 |
| FreeInit: Bridging Initialization Gap in Video Diffusion Models | Dec 12, 2023 | Denoising, Text-to-Video Generation | Code Available | 2 |
| Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers | Jun 25, 2024 | Image Generation, Model Compression | Code Available | 2 |
| Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model | Apr 2, 2024 | Video Generation | Code Available | 2 |
| IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation | Jul 15, 2024 | Denoising, Depth Estimation | Code Available | 2 |
| Compositional Video Generation as Flow Equalization | Jun 10, 2024 | Video Editing, Video Generation | Code Available | 2 |
| Depth-Aware Generative Adversarial Network for Talking Head Video Generation | Mar 13, 2022 | 3D geometry, Generative Adversarial Network | Code Available | 2 |
| FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation | Jun 10, 2025 | Image-text Retrieval, Question Answering | Code Available | 2 |
| AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance | Nov 21, 2023 | Image Animation, Image to Video Generation | Code Available | 2 |
| Collaborative Neural Rendering using Anime Character Sheets | Jul 12, 2022 | Image Generation, Image to 3D | Code Available | 2 |
| AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI | Jan 3, 2024 | Video Alignment, Video Generation | Code Available | 2 |
| Kandinsky 3.0 Technical Report | Dec 6, 2023 | Image Generation, Image to Video Generation | Code Available | 2 |
| CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities | Aug 23, 2024 | Denoising, Motion Generation | Code Available | 2 |
| Progressive Autoregressive Video Diffusion Models | Oct 10, 2024 | Denoising, Video Denoising | Code Available | 2 |
| PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop | Mar 12, 2025 | Diagnostic, Video Generation | Code Available | 2 |
| PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation | Nov 30, 2024 | Text-to-Video Generation, Video Generation | Code Available | 2 |
| LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models | Mar 18, 2025 | compressed sensing, Video Generation | Code Available | 2 |
| PresentAgent: Multimodal Agent for Presentation Video Generation | Jul 5, 2025 | text-to-speech, Text to Speech | Code Available | 2 |
| Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion | May 30, 2024 | Semantic Communication, Video Compression | Code Available | 2 |
| Owl-1: Omni World Model for Consistent Long Video Generation | Dec 12, 2024 | Video Generation | Code Available | 2 |
| Panacea: Panoramic and Controllable Video Generation for Autonomous Driving | Nov 28, 2023 | Autonomous Driving, Video Generation | Code Available | 2 |
| ORV: 4D Occupancy-centric Robot Video Generation | Jun 3, 2025 | Video Generation | Code Available | 2 |
| Phenaki: Variable Length Video Generation From Open Domain Textual Description | Oct 5, 2022 | Decoder, Video Generation | Code Available | 2 |
| Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models | Dec 10, 2024 | Video Generation | Code Available | 2 |
| CelebV-Text: A Large-Scale Facial Text-Video Dataset | Mar 26, 2023 | Text Generation, Text-to-Video Generation | Code Available | 2 |
| CelebV-HQ: A Large-Scale Video Facial Attributes Dataset | Jul 25, 2022 | Attribute, Diversity | Code Available | 2 |
| FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models | Jun 24, 2024 | Video Generation | Code Available | 2 |
| SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | Oct 16, 2024 | Denoising, Video Generation | Code Available | 2 |
| AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion | Mar 10, 2025 | Video Generation | Code Available | 2 |
| Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs | Jun 13, 2024 | Benchmarking, Question Answering | Code Available | 2 |
| Neighboring Autoregressive Modeling for Efficient Visual Generation | Mar 12, 2025 | Image Generation, Text to Image Generation | Code Available | 2 |
| DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation | Mar 27, 2025 | Denoising, Human Animation | Code Available | 2 |
| Omni-Video: Democratizing Unified Video Understanding and Generation | Jul 8, 2025 | Video Generation, Video Understanding | Code Available | 2 |