| Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model | Feb 14, 2025 | Video Generation, Video Reconstruction | Code Available | 7 | 5 |
| VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | Mar 23, 2022 | 4k, Action Classification | Code Available | 3 | 5 |
| Motion Representations for Articulated Animation | Apr 22, 2021 | Object, Video Reconstruction | Code Available | 3 | 5 |
| Image and Video Tokenization with Binary Spherical Quantization | Jun 11, 2024 | Decoder, Image Generation | Code Available | 3 | 5 |
| First Order Motion Model for Image Animation | Feb 29, 2020 | Image Animation, model | Code Available | 3 | 5 |
| Seeing World Dynamics in a Nutshell | Feb 5, 2025 | Video Reconstruction | Code Available | 2 | 5 |
| LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models | Mar 18, 2025 | compressed sensing, Video Generation | Code Available | 2 | 5 |
| LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior | Oct 28, 2024 | Video Generation, Video Reconstruction | Code Available | 2 | 5 |
| NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction | Oct 25, 2024 | SSIM, Video Reconstruction | Code Available | 2 | 5 |
| Cascaded Temporal Updating Network for Efficient Video Super-Resolution | Aug 26, 2024 | Super-Resolution, Video Reconstruction | Code Available | 1 | 5 |