| Pix2Gif: Motion-Guided Diffusion for GIF Generation | Mar 7, 2024 | Video Generation | Code Available | 1 |
| A spatiotemporal style transfer algorithm for dynamic visual stimulus generation | Mar 7, 2024 | Image Generation, Object Recognition | Unverified | 0 |
| Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation | Mar 5, 2024 | Denoising, Image Animation | Unverified | 0 |
| AtomoVideo: High Fidelity Image-to-Video Generation | Mar 4, 2024 | Image Generation, Image to Video Generation | Unverified | 0 |
| UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control | Mar 4, 2024 | Diversity, Video Generation | Code Available | 2 |
| Abductive Ego-View Accident Video Understanding for Safe Driving Perception | Mar 1, 2024 | Object, object-detection | Unverified | 0 |
| Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Feb 29, 2024 | Retrieval, Text Retrieval | Code Available | 4 |
| Context-aware Talking Face Video Generation | Feb 28, 2024 | Video Generation, Video Synchronization | Unverified | 0 |
| Sora Generates Videos with Stunning Geometrical Consistency | Feb 27, 2024 | 3D Reconstruction, Video Generation | Unverified | 0 |
| Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models | Feb 27, 2024 | Marketing, Video Generation | Code Available | 4 |
| Video as the New Language for Real-World Decision Making | Feb 27, 2024 | Decision Making, In-Context Learning | Unverified | 0 |
| EMO: Emote Portrait Alive -- Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions | Feb 27, 2024 | Video Generation | Unverified | 0 |
| Contextualized Diffusion Models for Text-Guided Image and Video Generation | Feb 26, 2024 | Image Generation, Text to Image Generation | Code Available | 2 |
| Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models | Feb 22, 2024 | Video Generation | Unverified | 0 |
| Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | Feb 22, 2024 | Image Generation, Text-to-Video Generation | Unverified | 0 |
| Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation | Feb 21, 2024 | Video Generation, Video Reconstruction | Unverified | 0 |
| VGMShield: Mitigating Misuse of Video Generative Models | Feb 20, 2024 | Video Generation | Code Available | 0 |
| Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation | Feb 16, 2024 | Video Generation | Code Available | 2 |
| Magic-Me: Identity-Specific Video Customized Diffusion | Feb 14, 2024 | Image Generation, Text to Image Generation | Code Available | 3 |
| Denoising Diffusion Probabilistic Models in Six Simple Steps | Feb 6, 2024 | Denoising, Video Generation | Unverified | 0 |
| ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation | Feb 6, 2024 | Image to Video Generation, Video Generation | Code Available | 3 |
| Constrained Synthesis with Projected Diffusion Models | Feb 5, 2024 | Motion Synthesis, Video Generation | Code Available | 1 |
| Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion | Feb 5, 2024 | Object, Video Generation | Unverified | 0 |
| Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization | Feb 5, 2024 | Science Question Answering, Text-to-Video Generation | Code Available | 4 |
| InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions | Feb 5, 2024 | Video Generation | Code Available | 2 |
| Detecting AI-Generated Video via Frame Consistency | Feb 3, 2024 | Video Generation | Code Available | 1 |
| NeuroCine: Decoding Vivid Video Sequences from Human Brain Activties | Feb 2, 2024 | Contrastive Learning, SSIM | Unverified | 0 |
| AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data | Feb 1, 2024 | Conditional Image Generation, Denoising | Code Available | 4 |
| A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming | Jan 30, 2024 | Video Generation, Video Understanding | Unverified | 0 |
| Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling | Jan 29, 2024 | Image to Video Generation, Video Generation | Unverified | 0 |
| DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Representations | Jan 23, 2024 | 3D Shape Generation, Image Generation | Code Available | 1 |
| Lumiere: A Space-Time Diffusion Model for Video Generation | Jan 23, 2024 | Super-Resolution, Text-to-Video Generation | Code Available | 3 |
| Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion | Jan 19, 2024 | 3D Generation, Neural Rendering | Code Available | 1 |
| Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution | Jan 18, 2024 | Super-Resolution, Video Generation | Unverified | 0 |
| CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects | Jan 18, 2024 | Object, Text-to-Video Generation | Unverified | 0 |
| WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens | Jan 18, 2024 | Video Editing, Video Generation | Unverified | 0 |
| Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation | Jan 18, 2024 | Denoising, Position | Unverified | 0 |
| Vlogger: Make Your Dream A Vlog | Jan 17, 2024 | Language Modelling, Large Language Model | Code Available | 1 |
| VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models | Jan 17, 2024 | Text-to-Video Generation, Video Generation | Code Available | 9 |
| UniVG: Towards UNIfied-modal Video Generation | Jan 17, 2024 | Video Generation | Unverified | 0 |
| E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning | Jan 16, 2024 | Video Generation | Code Available | 1 |
| Towards A Better Metric for Text-to-Video Generation | Jan 15, 2024 | Mixture-of-Experts, Text-to-Video Generation | Unverified | 0 |
| 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model | Jan 12, 2024 | Video Generation | Unverified | 0 |
| RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks | Jan 11, 2024 | Generative Adversarial Network, Optical Flow Estimation | Unverified | 0 |
| MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation | Jan 9, 2024 | MORPH, Video Generation | Unverified | 0 |
| Neural Rendering and Its Hardware Acceleration: A Review | Jan 6, 2024 | 3D Reconstruction, Deep Learning | Unverified | 0 |
| Latte: Latent Diffusion Transformer for Video Generation | Jan 5, 2024 | Text-to-Video Generation, Video Generation | Code Available | 5 |
| Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions | Jan 3, 2024 | Image Animation, Video Editing | Code Available | 1 |
| AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI | Jan 3, 2024 | Video Alignment, Video Generation | Code Available | 2 |
| VideoStudio: Generating Consistent-Content and Multi-Scene Videos | Jan 2, 2024 | Descriptive, Video Generation | Code Available | 1 |