SOTAVerified

Image to Video Generation

Image-to-video generation is the task of generating a sequence of video frames from a single still image or a set of still images. The generated video should preserve the appearance and style of the input and show plausible motion, and it should be temporally consistent: consecutive frames should form a smooth, correctly ordered sequence. The task is typically tackled with deep generative models, such as Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs), trained on large video datasets. The models learn to generate plausible frames conditioned on the input image and on any auxiliary information, such as an audio or text track.
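The input/output contract described above can be illustrated with a small sketch. It is not a generative model: `toy_image_to_video` is a hypothetical stand-in that only shifts the input image sideways, and `mean_frame_difference` is an assumed, crude proxy for temporal consistency rather than a metric used by any of the papers listed here.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame as rows of pixel intensities


def toy_image_to_video(image: Frame, num_frames: int, shift_per_frame: int = 1) -> List[Frame]:
    """Hypothetical stand-in for a learned generator: frame t is the input
    image rolled horizontally by t * shift_per_frame pixels."""
    frames = []
    for t in range(num_frames):
        k = (t * shift_per_frame) % len(image[0])
        # Roll each row right by k pixels; with k == 0 the frame equals the input.
        frames.append([row[-k:] + row[:-k] if k else list(row) for row in image])
    return frames


def mean_frame_difference(frames: List[Frame]) -> float:
    """Average absolute pixel change between consecutive frames.
    Lower values mean smoother change from frame to frame."""
    total, count = 0.0, 0
    for prev, curr in zip(frames, frames[1:]):
        for row_p, row_c in zip(prev, curr):
            for p, c in zip(row_p, row_c):
                total += abs(p - c)
                count += 1
    return total / count if count else 0.0


# Usage: a 2x4 image with one bright column, extended to a 3-frame clip.
image = [[0.0, 1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0]]
video = toy_image_to_video(image, num_frames=3)
print(len(video), mean_frame_difference(video))
```

The first frame equals the conditioning image, which mirrors the requirement that the output stay consistent with the input. A real model would replace the shift with learned motion.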

Papers

Showing 61-70 of 85 papers

| Title | Status | Hype |
|---|---|---|
| MarDini: Masked Autoregressive Diffusion for Video Generation at Scale | | 0 |
| FrameBridge: Improving Image-to-Video Generation with Bridge Models | | 0 |
| Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention | | 0 |
| OSV: One Step is Enough for High-Quality Image to Video Generation | | 0 |
| GenRec: Unifying Video Generation and Recognition with Diffusion Models | Code | 0 |
| Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data | Code | 0 |
| Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model | | 0 |
| CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation | | 0 |
| CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers | | 0 |
| Dance Any Beat: Blending Beats with Visuals in Dance Video Generation | | 0 |

No leaderboard results yet.