| Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k | Mar 12, 2025 | Video Generation | Code Available | 13 |
| Open-Sora: Democratizing Efficient Video Production for All | Dec 29, 2024 | All, Image Generation | Code Available | 13 |
| Wan: Open and Advanced Large-Scale Video Generative Models | Mar 26, 2025 | Video Editing, Video Generation | Code Available | 11 |
| HunyuanVideo: A Systematic Framework For Large Video Generative Models | Dec 3, 2024 | Video Alignment, Video Generation | Code Available | 11 |
| Open-Sora Plan: Open-Source Large Video Generation Model | Nov 28, 2024 | Video Generation | Code Available | 11 |
| CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer | Aug 12, 2024 | Text-to-Video Generation, Video Alignment | Code Available | 11 |
| LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control | Jul 3, 2024 | Computational Efficiency, Face Reenactment | Code Available | 11 |
| SkyReels-V2: Infinite-length Film Generative Model | Apr 17, 2025 | Large Language Model, model | Code Available | 9 |
| LTX-Video: Realtime Video Latent Diffusion | Dec 30, 2024 | Denoising, GPU | Code Available | 9 |
| MuseTalk: Real-Time High-Fidelity Video Dubbing via Spatio-Temporal Sampling | Oct 14, 2024 | Audio-Visual Synchronization, GPU | Code Available | 9 |
| StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation | May 2, 2024 | motion prediction, Story Generation | Code Available | 9 |
| VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models | Jan 17, 2024 | Text-to-Video Generation, Video Generation | Code Available | 9 |
| Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation | May 28, 2025 | Human Animation, Instruction Following | Code Available | 7 |
| SageAttention2++: A More Efficient Implementation of SageAttention2 | May 27, 2025 | Quantization, Video Generation | Code Available | 7 |
| MAGI-1: Autoregressive Video Generation at Scale | May 19, 2025 | Video Generation | Code Available | 7 |
| Aligning Anime Video Generation with Human Feedback | Apr 14, 2025 | Video Generation | Code Available | 7 |
| VACE: All-in-One Video Creation and Editing | Mar 10, 2025 | All, Human-Domain Subject-to-Video | Code Available | 7 |
| Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model | Feb 14, 2025 | Video Generation, Video Reconstruction | Code Available | 7 |
| Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile | Feb 10, 2025 | Video Generation | Code Available | 7 |
| Goku: Flow Based Video Generative Foundation Models | Feb 7, 2025 | Image Generation, Text to Image Generation | Code Available | 7 |
| Fast Video Generation with Sliding Tile Attention | Feb 6, 2025 | Video Generation | Code Available | 7 |
| AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era | Dec 13, 2024 | Image to Video Generation, Video Generation | Code Available | 7 |
| SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization | Nov 17, 2024 | Image Generation, Quantization | Code Available | 7 |
| EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation | Nov 15, 2024 | Audio-Driven Body Animation, Human Animation | Code Available | 7 |
| Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation | Oct 10, 2024 | 4k, Image Animation | Code Available | 7 |
| Pyramidal Flow Matching for Efficient Video Generative Modeling | Oct 8, 2024 | GPU, Text-to-Video Generation | Code Available | 7 |
| SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Oct 3, 2024 | Image Generation, Quantization | Code Available | 7 |
| Real-Time Video Generation with Pyramid Attention Broadcast | Aug 22, 2024 | Video Generation | Code Available | 7 |
| EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture | May 29, 2024 | Image Generation, Video Generation | Code Available | 7 |
| Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability | May 27, 2024 | Autonomous Driving, Video Generation | Code Available | 7 |
| Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance | Mar 21, 2024 | Animated GIF Generation, Image Animation | Code Available | 7 |
| DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers | Mar 15, 2024 | Text Generation, Video Generation | Code Available | 7 |
| DragAnything: Motion Control for Anything using Entity Representation | Mar 12, 2024 | Object, Video Generation | Code Available | 7 |
| SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models | Nov 28, 2023 | Video Generation | Code Available | 6 |
| CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers | May 29, 2022 | Text-to-Video Generation, Video Generation | Code Available | 6 |
| Matrix-Game: Interactive World Foundation Model | Jun 23, 2025 | Minecraft, model | Code Available | 5 |
| Show-o2: Improved Native Unified Multimodal Models | Jun 18, 2025 | Language Modeling, Language Modelling | Code Available | 5 |
| OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation | Jun 2, 2025 | Data Augmentation, Human Animation | Code Available | 5 |
| DanceGRPO: Unleashing GRPO on Visual Generation | May 12, 2025 | Denoising, reinforcement-learning | Code Available | 5 |
| HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation | May 7, 2025 | Human-Domain Subject-to-Video, Single-Domain Subject-to-Video | Code Available | 5 |
| VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness | Mar 27, 2025 | Anomaly Detection, Video Generation | Code Available | 5 |
| GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control | Mar 5, 2025 | Novel View Synthesis, Video Generation | Code Available | 5 |
| Phantom: Subject-consistent video generation via cross-modal alignment | Feb 16, 2025 | cross-modal alignment, Human-Domain Subject-to-Video | Code Available | 5 |
| TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models | Feb 10, 2025 | 3D Generation, 3D Reconstruction | Code Available | 5 |
| Video Depth Anything: Consistent Depth Estimation for Super-Long Videos | Jan 21, 2025 | Computational Efficiency, Depth Estimation | Code Available | 5 |
| StableAnimator: High-Quality Identity-Preserving Human Image Animation | Nov 26, 2024 | Denoising, Face Reenactment | Code Available | 5 |
| VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models | Nov 20, 2024 | Benchmarking, Image Generation | Code Available | 5 |
| Allegro: Open the Black Box of Commercial-Level Video Generation Model | Oct 20, 2024 | Video Generation | Code Available | 5 |
| DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos | Sep 3, 2024 | Depth Estimation, Diversity | Code Available | 5 |
| ControlNeXt: Powerful and Efficient Control for Image and Video Generation | Aug 12, 2024 | Video Generation | Code Available | 5 |