| VidToMe: Video Token Merging for Zero-Shot Video Editing | Dec 17, 2023 | Video Editing, Video Generation | Code Available | 2 |
| FreeInit: Bridging Initialization Gap in Video Diffusion Models | Dec 12, 2023 | Denoising, Text-to-Video Generation | Code Available | 2 |
| Kandinsky 3.0 Technical Report | Dec 6, 2023 | Image Generation, Image to Video Generation | Code Available | 2 |
| AnimateZero: Video Diffusion Models are Zero-Shot Image Animators | Dec 6, 2023 | Image Animation, Video Generation | Code Available | 2 |
| TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models | Dec 1, 2023 | Image Classification, Multi-Object Tracking | Code Available | 2 |
| StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter | Dec 1, 2023 | Disentanglement, Text-to-Video Generation | Code Available | 2 |
| Panacea: Panoramic and Controllable Video Generation for Autonomous Driving | Nov 28, 2023 | Autonomous Driving, Video Generation | Code Available | 2 |
| AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance | Nov 21, 2023 | Image Animation, Image to Video Generation | Code Available | 2 |
| SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction | Oct 31, 2023 | Prediction, Semantic Similarity | Code Available | 2 |
| LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation | Oct 16, 2023 | GPU, Image Animation | Code Available | 2 |
| DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model | Oct 11, 2023 | Autonomous Driving, Image Generation | Code Available | 2 |
| DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving | Sep 18, 2023 | Autonomous Driving, Video Generation | Code Available | 2 |
| DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory | Aug 16, 2023 | Trajectory Modeling, Video Generation | Code Available | 2 |
| Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation | Jul 19, 2023 | Talking Head Generation, Video Generation | Code Available | 2 |
| Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation | Jul 13, 2023 | Retrieval, Video Generation | Code Available | 2 |
| Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising | May 29, 2023 | Denoising, Image Generation | Code Available | 2 |
| Control-A-Video: Controllable Text-to-Video Diffusion Models with Motion Prior and Reward Feedback Learning | May 23, 2023 | Image Generation, Optical Flow Estimation | Code Available | 2 |
| ControlVideo: Training-free Controllable Text-to-Video Generation | May 22, 2023 | Image Generation, Text-to-Video Generation | Code Available | 2 |
| VDT: General-purpose Video Diffusion Transformers via Mask Modeling | May 22, 2023 | Autonomous Driving, Video Generation | Code Available | 2 |
| DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation | May 10, 2023 | 3D geometry, Generative Adversarial Network | Code Available | 2 |
| StyleAvatar: Real-time Photo-realistic Portrait Avatar from a Single Video | May 1, 2023 | Face Reenactment, Translation | Code Available | 2 |
| Text2Performer: Text-Driven Human Video Generation | Apr 17, 2023 | Video Generation | Code Available | 2 |
| CelebV-Text: A Large-Scale Facial Text-Video Dataset | Mar 26, 2023 | Text Generation, Text-to-Video Generation | Code Available | 2 |
| Conditional Image-to-Video Generation with Latent Flow Diffusion Models | Mar 24, 2023 | Image to Video Generation, Motion Generation | Code Available | 2 |
| Blind Video Deflickering by Neural Filtering with a Flawed Atlas | Mar 14, 2023 | Video Generation, Video Temporal Consistency | Code Available | 2 |