SOTAVerified

Image to Video Generation

Image to Video Generation is the task of generating a sequence of video frames from a single still image or a small set of still images. The goal is a video that is consistent with the input in appearance, motion, and style, and temporally coherent, so that the generated frames read as a smooth, ordered sequence. The task is typically tackled with deep generative models, such as diffusion models, Generative Adversarial Networks (GANs), or Variational Autoencoders (VAEs), trained on large video datasets. These models learn to generate plausible frames conditioned on the input image and, optionally, on auxiliary signals such as an audio track or a text prompt.
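The conditioning structure described above can be sketched as a simple autoregressive rollout: each new frame is generated conditioned on the previous one, starting from the input image. The sketch below is a toy illustration only; `step_fn` and `toy_step` are hypothetical stand-ins for a trained generative model (e.g. a GAN generator or diffusion sampler), not any real library API.

```python
import numpy as np

def generate_video(first_frame, num_frames, step_fn, seed=0):
    """Autoregressively roll out a video conditioned on an initial image.

    `step_fn(prev_frame, rng)` stands in for a trained generative model
    that predicts the next frame given the previous one.
    """
    rng = np.random.default_rng(seed)
    frames = [first_frame]
    for _ in range(num_frames - 1):
        frames.append(step_fn(frames[-1], rng))
    return np.stack(frames)  # shape: (num_frames, H, W, C)

# Toy stand-in "model": shift the image rightward and add small noise,
# mimicking coherent motion plus generative variation.
def toy_step(frame, rng):
    shifted = np.roll(frame, shift=2, axis=1)
    return np.clip(shifted + rng.normal(0.0, 0.01, frame.shape), 0.0, 1.0)

image = np.zeros((64, 64, 3))
image[24:40, 8:24] = 1.0  # a white square as the conditioning image
video = generate_video(image, num_frames=16, step_fn=toy_step)
print(video.shape)  # (16, 64, 64, 3)
```

In a real system the per-frame predictor is replaced by a learned network, and many modern approaches (including several diffusion models in the list below) denoise all frames jointly rather than strictly one at a time, but the conditioning-on-the-input-image idea is the same.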

Papers

Showing 41–50 of 85 papers

Title (Hype)

- Towards Physically Plausible Video Generation via VLM Planning (0)
- LMP: Leveraging Motion Prior in Zero-Shot Video Generation with Diffusion Transformer (0)
- Dreamix: Video Diffusion Models are General Video Editors (0)
- MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance (0)
- TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models (0)
- MarDini: Masked Autoregressive Diffusion for Video Generation at Scale (0)
- MG-Gen: Single Image to Motion Graphics Generation with Layer Decomposition (0)
- Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation (0)
- Decouple Content and Motion for Conditional Image-to-Video Generation (0)
- MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent (0)
Page 5 of 9

No leaderboard results yet.