| Generative Inbetweening through Frame-wise Conditions-Driven Video Generation | Dec 16, 2024 | Video Generation | Code Available | 2 |
| Owl-1: Omni World Model for Consistent Long Video Generation | Dec 12, 2024 | Video Generation | Code Available | 2 |
| Doe-1: Closed-Loop Autonomous Driving with Large World Model | Dec 12, 2024 | Autonomous Driving, Decision Making | Code Available | 2 |
| Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models | Dec 10, 2024 | Video Generation | Code Available | 2 |
| Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model | Dec 6, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 2 |
| Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation | Dec 5, 2024 | Image Comprehension, Representation Learning | Code Available | 2 |
| VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation | Dec 3, 2024 | Script Generation, Video Generation | Code Available | 2 |
| PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation | Nov 30, 2024 | Text-to-Video Generation, Video Generation | Code Available | 2 |
| Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints | Nov 26, 2024 | Denoising, Image Generation | Code Available | 2 |
| Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing | Nov 25, 2024 | Denoising, Video Generation | Code Available | 2 |
| MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation | Nov 22, 2024 | Video Generation | Code Available | 2 |
| LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior | Oct 28, 2024 | Video Generation, Video Reconstruction | Code Available | 2 |
| SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | Oct 16, 2024 | Denoising, Video Generation | Code Available | 2 |
| VideoAgent: Self-Improving Video Generation | Oct 14, 2024 | Hallucination, Video Generation | Code Available | 2 |
| Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models | Oct 14, 2024 | 3D geometry, Denoising | Code Available | 2 |
| LVD-2M: A Long-take Video Dataset with Temporally Dense Captions | Oct 14, 2024 | Video Captioning, Video Generation | Code Available | 2 |
| Progressive Autoregressive Video Diffusion Models | Oct 10, 2024 | Denoising, Video Denoising | Code Available | 2 |
| TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation | Oct 8, 2024 | Video Generation | Code Available | 2 |
| Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation | Oct 7, 2024 | Prompt Engineering, Video Generation | Code Available | 2 |
| SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation | Sep 10, 2024 | Video Generation | Code Available | 2 |
| CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities | Aug 23, 2024 | Denoising, Motion Generation | Code Available | 2 |
| Fréchet Video Motion Distance: A Metric for Evaluating Motion Consistency in Videos | Jul 23, 2024 | Image Generation, Point Tracking | Code Available | 2 |
| T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation | Jul 19, 2024 | Attribute, Language Modeling | Code Available | 2 |
| IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation | Jul 15, 2024 | Denoising, Depth Estimation | Code Available | 2 |
| Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers | Jun 25, 2024 | Image Generation, Model Compression | Code Available | 2 |
| FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models | Jun 24, 2024 | Video Generation | Code Available | 2 |
| ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models | Jun 16, 2024 | Video Generation | Code Available | 2 |
| Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs | Jun 13, 2024 | Benchmarking, Question Answering | Code Available | 2 |
| Compositional Video Generation as Flow Equalization | Jun 10, 2024 | Video Editing, Video Generation | Code Available | 2 |
| SF-V: Single Forward Video Generation Model | Jun 6, 2024 | Denoising, model | Code Available | 2 |
| GenAI Arena: An Open Evaluation Platform for Generative Models | Jun 6, 2024 | Image Generation, Instruction Following | Code Available | 2 |
| ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation | Jun 3, 2024 | GPU, Video Generation | Code Available | 2 |
| Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion | May 30, 2024 | Semantic Communication, Video Compression | Code Available | 2 |
| Improving the Training of Rectified Flows | May 30, 2024 | Image Generation, Knowledge Distillation | Code Available | 2 |
| DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark | May 30, 2024 | DeepFake Detection, Mamba | Code Available | 2 |
| Video Diffusion Models are Training-free Motion Interpreter and Controller | May 23, 2024 | Video Generation | Code Available | 2 |
| Video Diffusion Models: A Survey | May 6, 2024 | Survey, Text-to-Video Generation | Code Available | 2 |
| TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models | Apr 25, 2024 | Denoising, Image to Video Generation | Code Available | 2 |
| Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model | Apr 2, 2024 | Video Generation | Code Available | 2 |
| Motion Inversion for Video Customization | Mar 29, 2024 | Video Generation | Code Available | 2 |
| Adaptive Super Resolution For One-Shot Talking-Head Generation | Mar 23, 2024 | Decoder, Super-Resolution | Code Available | 2 |
| VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models | Mar 10, 2024 | Copy Detection, Image Generation | Code Available | 2 |
| VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models | Mar 8, 2024 | Video Generation | Code Available | 2 |
| UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control | Mar 4, 2024 | Diversity, Video Generation | Code Available | 2 |
| Contextualized Diffusion Models for Text-Guided Image and Video Generation | Feb 26, 2024 | Image Generation, Text to Image Generation | Code Available | 2 |
| Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation | Feb 16, 2024 | Video Generation | Code Available | 2 |
| InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions | Feb 5, 2024 | Video Generation | Code Available | 2 |
| AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI | Jan 3, 2024 | Video Alignment, Video Generation | Code Available | 2 |
| DreamGaussian4D: Generative 4D Gaussian Splatting | Dec 28, 2023 | Video Generation | Code Available | 2 |
| I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models | Dec 27, 2023 | Video Generation | Code Available | 2 |