| Text-Animator: Controllable Visual Text Video Generation | Jun 25, 2024 | Text Generation, Video Generation | Unverified | 0 |
| MotionBooth: Motion-Aware Customized Text-to-Video Generation | Jun 25, 2024 | Text-to-Video Generation, Video Generation | Unverified | 0 |
| Video-Infinity: Distributed Long Video Generation | Jun 24, 2024 | GPU, Video Generation | Unverified | 0 |
| Dreamitate: Real-World Visuomotor Policy Learning via Video Generation | Jun 24, 2024 | Video Generation | Unverified | 0 |
| Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model | Jun 22, 2024 | Attribute, Image to Video Generation | Unverified | 0 |
| VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation | Jun 21, 2024 | Video Generation, Video Quality Assessment | Unverified | 0 |
| ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning | Jun 20, 2024 | GPU, Video Generation | Code Available | 0 |
| Video Generation with Learned Action Prior | Jun 20, 2024 | Image Generation, Image Reconstruction | Unverified | 0 |
| ARDuP: Active Region Video Diffusion for Universal Policies | Jun 19, 2024 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Splatter a Video: Video Gaussian Representation for Versatile Processing | Jun 19, 2024 | Depth Estimation, Depth Prediction | Unverified | 0 |
| NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation | Jun 17, 2024 | Knowledge Distillation, NeRF | Unverified | 0 |
| Vid3D: Synthesis of Dynamic 3D Scenes using 2D Video Diffusion | Jun 17, 2024 | Video Generation | Code Available | 0 |
| Training-free Camera Control for Video Generation | Jun 14, 2024 | Data Augmentation, Video Generation | Unverified | 0 |
| Vivid-ZOO: Multi-View Video Generation with Diffusion Model | Jun 12, 2024 | Video Generation | Unverified | 0 |
| Hierarchical Patch Diffusion Models for High-Resolution Video Generation | Jun 12, 2024 | Video Generation | Unverified | 0 |
| DiTFastAttn: Attention Compression for Diffusion Transformer Models | Jun 12, 2024 | 2k, Image Generation | Unverified | 0 |
| HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness | Jun 11, 2024 | Object, Video Editing | Unverified | 0 |
| 4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models | Jun 11, 2024 | Scene Generation, Video Generation | Unverified | 0 |
| Visual Representation Learning with Stochastic Frame Prediction | Jun 11, 2024 | Decoder, Pose Tracking | Unverified | 0 |
| AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation | Jun 11, 2024 | Audio Generation, Video Generation | Unverified | 0 |
| CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion | Jun 7, 2024 | Scheduling, Video Generation | Unverified | 0 |
| Zero-Shot Video Editing through Adaptive Sliding Score Distillation | Jun 7, 2024 | Denoising, Text-to-Video Generation | Unverified | 0 |
| VideoPhy: Evaluating Physical Commonsense for Video Generation | Jun 5, 2024 | Video Generation | Unverified | 0 |
| Follow-Your-Pose v2: Multiple-Condition Guided Character Image Animation for Stable Pose Control | Jun 5, 2024 | Image Animation, Video Generation | Unverified | 0 |
| CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation | Jun 4, 2024 | Image to Video Generation, Video Generation | Unverified | 0 |
| V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation | Jun 4, 2024 | Video Generation | Unverified | 0 |
| I4VGen: Image as Free Stepping Stone for Text-to-Video Generation | Jun 4, 2024 | Diversity, Image Generation | Unverified | 0 |
| Learning Temporally Consistent Video Depth from Video Diffusion Priors | Jun 3, 2024 | Depth Estimation, Novel View Synthesis | Unverified | 0 |
| Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation | Jun 3, 2024 | Autonomous Driving, Video Generation | Unverified | 0 |
| 4Diffusion: Multi-view Video Diffusion Model for 4D Generation | May 31, 2024 | NeRF, Video Generation | Unverified | 0 |
| VITON-DiT: Learning In-the-Wild Video Try-On from Human Dance Videos via Diffusion Transformers | May 28, 2024 | Denoising, Video Generation | Unverified | 0 |
| MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation | May 28, 2024 | Video Generation | Code Available | 0 |
| RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance | May 27, 2024 | Image Generation, Video Generation | Unverified | 0 |
| Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation | May 27, 2024 | Video Generation | Unverified | 0 |
| Human4DiT: 360-degree Human Video Generation with 4D Diffusion Transformer | May 27, 2024 | Video Generation | Unverified | 0 |
| Controllable Longer Image Animation with Diffusion Models | May 27, 2024 | Image Animation, motion prediction | Unverified | 0 |
| Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control | May 27, 2024 | Scene Generation, Video Generation | Unverified | 0 |
| Towards Multi-Task Multi-Modal Models: A Video Generative Perspective | May 26, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Disentangling Foreground and Background Motion for Enhanced Realism in Human Video Generation | May 26, 2024 | Video Generation | Unverified | 0 |
| A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence | May 24, 2024 | Text Generation, Video Generation | Code Available | 0 |
| Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation | May 24, 2024 | Image Generation, Mamba | Unverified | 0 |
| PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control | May 23, 2024 | Video Generation | Unverified | 0 |
| Fisher Flow Matching for Generative Modeling over Discrete Data | May 23, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes | May 23, 2024 | 3D Generation, Autonomous Driving | Unverified | 0 |
| ReVideo: Remake a Video with Motion and Content Control | May 22, 2024 | Video Editing, Video Generation | Unverified | 0 |
| DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control | May 21, 2024 | Attribute, Motion Generation | Unverified | 0 |
| CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers | May 21, 2024 | Image to Video Generation, Video Generation | Unverified | 0 |
| Dance Any Beat: Blending Beats with Visuals in Dance Video Generation | May 15, 2024 | Image to Video Generation, Optical Flow Estimation | Unverified | 0 |
| The Lost Melody: Empirical Observations on Text-to-Video Generation From A Storytelling Perspective | May 13, 2024 | Text-to-Video Generation, Video Generation | Unverified | 0 |
| Reviewing Intelligent Cinematography: AI research for camera-based video production | May 8, 2024 | Camera Calibration, object-detection | Unverified | 0 |