| Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis | Aug 18, 2023 | Dynamic Reconstruction, Novel View Synthesis | Code Available | 4 |
| Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators | Mar 23, 2023 | Image Generation, Text-to-Video Generation | Code Available | 4 |
| FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution | Apr 9, 2025 | 2k, Decision Making | Code Available | 3 |
| JoyGen: Audio-Driven 3D Depth-Aware Talking-Face Video Editing | Jan 3, 2025 | 3D Reconstruction, Face Generation | Code Available | 3 |
| DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation | Dec 24, 2024 | Video Editing, Video Generation | Code Available | 3 |
| AutoVFX: Physically Realistic Video Editing from Natural Language Instructions | Nov 4, 2024 | Code Generation, Video Editing | Code Available | 3 |
| Movie Gen: A Cast of Media Foundation Models | Oct 17, 2024 | Audio Generation, Video Editing | Code Available | 3 |
| Diffusion Model-Based Video Editing: A Survey | Jun 26, 2024 | model, Survey | Code Available | 3 |
| A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models | Jun 20, 2024 | Video Editing | Code Available | 3 |
| MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion | May 30, 2024 | Denoising, GPU | Code Available | 3 |