| CelebV-Text: A Large-Scale Facial Text-Video Dataset | Mar 26, 2023 | Text Generation, Text-to-Video Generation | Code Available | 2 |
| Conditional Image-to-Video Generation with Latent Flow Diffusion Models | Mar 24, 2023 | Image to Video Generation, Motion Generation | Code Available | 2 |
| Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators | Mar 23, 2023 | Image Generation, Text-to-Video Generation | Code Available | 4 |
| Feature-Conditioned Cascaded Video Diffusion Models for Precise Echocardiogram Synthesis | Mar 22, 2023 | Image Generation, Video Generation | Code Available | 1 |
| NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation | Mar 22, 2023 | Video Generation | Unverified | 0 |
| Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers | Mar 20, 2023 | Video Generation | Code Available | 1 |
| VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation | Mar 15, 2023 | Code Generation, Denoising | Code Available | 4 |
| Blind Video Deflickering by Neural Filtering with a Flawed Atlas | Mar 14, 2023 | Video Generation, Video Temporal Consistency | Code Available | 2 |
| Controllable Video Generation by Learning the Underlying Dynamical System with Neural ODE | Mar 9, 2023 | Video Generation | Unverified | 0 |
| Video-P2P: Video Editing with Cross-attention Control | Mar 8, 2023 | Image Generation, Video Editing | Code Available | 2 |
| MOSO: Decomposing MOtion, Scene and Object for Video Prediction | Mar 7, 2023 | Object, Unconditional Video Generation | Code Available | 1 |
| MotionVideoGAN: A Novel Video Generator Based on the Motion Space Learned from Image Pairs | Mar 6, 2023 | Motion Generation, Unconditional Video Generation | Code Available | 0 |
| Consistency Models | Mar 2, 2023 | Colorization, Image Generation | Code Available | 5 |
| One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2 | Feb 15, 2023 | Attribute, Disentanglement | Unverified | 0 |
| Video Probabilistic Diffusion Models in Projected Latent Space | Feb 15, 2023 | Video Generation | Code Available | 2 |
| Structure and Content-Guided Video Synthesis with Diffusion Models | Feb 6, 2023 | Disentanglement, Text-to-Video Generation | Unverified | 0 |
| Generative Diffusion Models on Graphs: Methods and Applications | Feb 6, 2023 | Denoising, Graph Generation | Code Available | 2 |
| SceneScape: Text-Driven Consistent Scene Generation | Feb 2, 2023 | Depth Estimation, Depth Prediction | Unverified | 0 |
| Dreamix: Video Diffusion Models are General Video Editors | Feb 2, 2023 | Image Animation, Image to Video Generation | Unverified | 0 |
| Learning Universal Policies via Text-Guided Video Generation | Jan 31, 2023 | Decision Making, Image Generation | Unverified | 0 |
| Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models | Jan 30, 2023 | Audio Generation, Text-to-Video Generation | Code Available | 2 |
| Regeneration Learning: A Learning Paradigm for Data Generation | Jan 21, 2023 | Image Generation, Representation Learning | Unverified | 0 |
| Time-Conditioned Generative Modeling of Object-Centric Representations for Video Decomposition and Prediction | Jan 21, 2023 | Disentanglement, Gaussian Processes | Code Available | 0 |
| Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation | Jan 6, 2023 | Face Generation, Talking Face Generation | Unverified | 0 |
| SIDGAN: High-Resolution Dubbed Video Generation via Shift-Invariant Learning | Jan 1, 2023 | Image Generation, Video Generation | Unverified | 0 |
| Recovering Surveillance Video Using RF Cues | Dec 27, 2022 | Video Generation | Unverified | 0 |
| Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation | Dec 22, 2022 | Style Transfer, Text-to-Video Generation | Code Available | 4 |
| Scalable Adaptive Computation for Iterative Generation | Dec 22, 2022 | Image Generation, Video Generation | Code Available | 0 |
| SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers | Dec 20, 2022 | Decoder, Denoising | Code Available | 1 |
| MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation | Dec 19, 2022 | cross-modal alignment, Denoising | Code Available | 2 |
| Towards Smooth Video Composition | Dec 14, 2022 | Image Generation, single-image-generation | Code Available | 1 |
| PV3D: A 3D Generative Model for Portrait Video Generation | Dec 13, 2022 | Video Generation | Unverified | 0 |
| MAGVIT: Masked Generative Video Transformer | Dec 10, 2022 | Multi-Task Learning, Text-to-Video Generation | Code Available | 2 |
| Neural Cell Video Synthesis via Optical-Flow Diffusion | Dec 6, 2022 | Cultural Vocal Bursts Intensity Prediction, Denoising | Unverified | 0 |
| Audio-Driven Co-Speech Gesture Video Generation | Dec 5, 2022 | Video Generation | Unverified | 0 |
| VIDM: Video Implicit Diffusion Models | Dec 1, 2022 | Generative Adversarial Network, Video Generation | Code Available | 1 |
| VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild | Nov 27, 2022 | Video Editing, Video Generation | Code Available | 5 |
| 3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models | Nov 25, 2022 | Denoising, NeRF | Unverified | 0 |
| Latent Video Diffusion Models for High-Fidelity Long Video Generation | Nov 23, 2022 | Denoising, Image Generation | Code Available | 2 |
| Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation | Nov 23, 2022 | Text-to-Video Generation, Video Generation | Code Available | 1 |
| SinFusion: Training Diffusion Models on a Single Image or Video | Nov 21, 2022 | Diversity, Image Manipulation | Code Available | 1 |
| MagicVideo: Efficient Video Generation With Latent Diffusion Models | Nov 20, 2022 | GPU, Text-to-Video Generation | Unverified | 0 |
| Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives | Nov 9, 2022 | Disentanglement, Video Generation | Code Available | 2 |
| INR-V: A Continuous Representation Space for Video-based Generative Tasks | Oct 29, 2022 | Video Generation, Video Inpainting | Code Available | 1 |
| Facial Expression Video Generation Based-On Spatio-temporal Convolutional GAN: FEV-GAN | Oct 20, 2022 | Facial expression generation, Video Generation | Unverified | 0 |
| DeepHS-HDRVideo: Deep High Speed High Dynamic Range Video Reconstruction | Oct 10, 2022 | Optical Flow Estimation, Video Frame Interpolation | Unverified | 0 |
| KP-RNN: A Deep Learning Pipeline for Human Motion Prediction and Synthesis of Performance Art | Oct 9, 2022 | Human motion prediction, Image-to-Image Translation | Code Available | 0 |
| See, Plan, Predict: Language-guided Cognitive Planning with Video Prediction | Oct 7, 2022 | Prediction, Video Generation | Unverified | 0 |
| Text-driven Video Prediction | Oct 6, 2022 | Causal Inference, Prediction | Unverified | 0 |