| DiffPerformer: Iterative Learning of Consistent Latent Guidance for Diffusion-based Human Video Generation | Jan 1, 2024 | Video Generation | Unverified | 0 |
| PromptCoT: Align Prompt Distribution via Adapted Chain-of-Thought | Jan 1, 2024 | Computational Efficiency, Prompt Engineering | Unverified | 0 |
| SNED: Superposition Network Architecture Search for Efficient Video Diffusion Model | Jan 1, 2024 | Video Generation | Unverified | 0 |
| LAMP: Learn A Motion Pattern for Few-Shot Video Generation | Jan 1, 2024 | GPU, Image Animation | Unverified | 0 |
| On the Content Bias in Frechet Video Distance | Jan 1, 2024 | Video Generation | Unverified | 0 |
| TrailBlazer: Trajectory Control for Diffusion-Based Video Generation | Dec 31, 2023 | Video Generation | Code Available | 1 |
| FlashVideo: A Framework for Swift Inference in Text-to-Video Generation | Dec 30, 2023 | Text-to-Video Generation, Video Generation | Unverified | 0 |
| DreamGaussian4D: Generative 4D Gaussian Splatting | Dec 28, 2023 | Video Generation | Code Available | 2 |
| I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models | Dec 27, 2023 | Video Generation | Code Available | 2 |
| A Recipe for Scaling up Text-to-Video Generation with Text-free Videos | Dec 25, 2023 | Image Generation, Text to Image Generation | Unverified | 0 |
| Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models | Dec 21, 2023 | Synthetic Data Generation, Video Generation | Unverified | 0 |
| Free-Editor: Zero-shot Text-driven 3D Scene Editing | Dec 21, 2023 | 3D scene Editing, Style Transfer | Code Available | 1 |
| VideoPoet: A Large Language Model for Zero-Shot Video Generation | Dec 21, 2023 | Decoder, Language Modeling | Unverified | 0 |
| InstructVideo: Instructing Video Diffusion Models with Human Feedback | Dec 19, 2023 | Video Generation | Unverified | 0 |
| Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method | Dec 19, 2023 | Video Generation | Code Available | 1 |
| VidToMe: Video Token Merging for Zero-Shot Video Editing | Dec 17, 2023 | Video Editing, Video Generation | Code Available | 2 |
| VideoLCM: Video Latent Consistency Model | Dec 14, 2023 | Computational Efficiency, Image Generation | Unverified | 0 |
| PEEKABOO: Interactive Video Generation via Masked-Diffusion | Dec 12, 2023 | Text-to-Video Generation, Video Generation | Code Available | 1 |
| FreeInit: Bridging Initialization Gap in Video Diffusion Models | Dec 12, 2023 | Denoising, Text-to-Video Generation | Code Available | 2 |
| Photorealistic Video Generation with Diffusion Models | Dec 11, 2023 | Super-Resolution, Text-to-Video Generation | Unverified | 0 |
| MotionCrafter: One-Shot Motion Customization of Diffusion Models | Dec 8, 2023 | Disentanglement, Motion Disentanglement | Code Available | 1 |
| DreaMoving: A Human Video Generation Framework based on Diffusion Models | Dec 8, 2023 | Video Generation | Unverified | 0 |
| NewMove: Customizing text-to-video models with novel motions | Dec 7, 2023 | Text-to-Video Generation, Video Generation | Unverified | 0 |
| DreamVideo: Composing Your Dream Videos with Customized Subject and Motion | Dec 7, 2023 | Image Generation, Video Generation | Unverified | 0 |
| GenTron: Diffusion Transformers for Image and Video Generation | Dec 7, 2023 | Text-to-Video Generation, Video Generation | Unverified | 0 |
| Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation | Dec 7, 2023 | Spatial Reasoning, Text-to-Video Generation | Unverified | 0 |
| MEVG: Multi-event Video Generation with Text-to-Video Models | Dec 7, 2023 | Video Generation | Unverified | 0 |
| GenDeF: Learning Generative Deformation Field for Video Generation | Dec 7, 2023 | Disentanglement, Video Editing | Unverified | 0 |
| AnimateZero: Video Diffusion Models are Zero-Shot Image Animators | Dec 6, 2023 | Image Animation, Video Generation | Code Available | 2 |
| FAAC: Facial Animation Generation with Anchor Frame and Conditional Control for Superior Fidelity and Editability | Dec 6, 2023 | Face Model, Video Generation | Unverified | 0 |
| Kandinsky 3.0 Technical Report | Dec 6, 2023 | Image Generation, Image to Video Generation | Code Available | 2 |
| MotionCtrl: A Unified and Flexible Motion Controller for Video Generation | Dec 6, 2023 | Object, Video Generation | Code Available | 3 |
| MagicStick: Controllable Video Editing via Control Handle Transformations | Dec 5, 2023 | Video Editing, Video Generation | Code Available | 1 |
| DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance | Dec 5, 2023 | Image to Video Generation, Video Generation | Unverified | 0 |
| BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models | Dec 5, 2023 | Image Generation, Model Selection | Code Available | 1 |
| WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation | Dec 5, 2023 | Autonomous Driving, Diversity | Code Available | 1 |
| LivePhoto: Real Image Animation with Text-guided Motion Control | Dec 5, 2023 | Image Animation, Text-to-Video Generation | Unverified | 0 |
| Fine-grained Controllable Video Generation via Object Appearance and Context | Dec 5, 2023 | Text-to-Video Generation, Video Generation | Unverified | 0 |
| DragVideo: Interactive Drag-style Video Editing | Dec 3, 2023 | Video Editing, Video Generation | Code Available | 1 |
| Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models | Dec 3, 2023 | Image Generation, Text to Image Generation | Unverified | 0 |
| VideoBooth: Diffusion-based Video Generation with Image Prompts | Dec 1, 2023 | Video Generation | Unverified | 0 |
| VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models | Dec 1, 2023 | Video Editing, Video Generation | Code Available | 1 |
| TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models | Dec 1, 2023 | Image Classification, Multi-Object Tracking | Code Available | 2 |
| StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter | Dec 1, 2023 | Disentanglement, Text-to-Video Generation | Code Available | 2 |
| VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models | Nov 30, 2023 | Semantic Segmentation, Video Editing | Unverified | 0 |
| MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation | Nov 30, 2023 | Image Generation, Text to Image Generation | Unverified | 0 |
| ARTV: Auto-Regressive Text-to-Video Generation with Diffusion Models | Nov 30, 2023 | Text-to-Video Generation, Video Generation | Unverified | 0 |
| VBench: Comprehensive Benchmark Suite for Video Generative Models | Nov 29, 2023 | Image Generation, Video Generation | Code Available | 3 |
| MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing | Nov 29, 2023 | Denoising, Image to Video Generation | Code Available | 1 |
| Panacea: Panoramic and Controllable Video Generation for Autonomous Driving | Nov 28, 2023 | Autonomous Driving, Video Generation | Code Available | 2 |