Image to Video Generation

Image to Video Generation refers to the task of generating a sequence of video frames from a single still image or a set of still images. The goal is to produce a video that is consistent with the input in appearance, motion, and style, and that is temporally coherent, meaning the generated frames form a smooth, plausibly ordered sequence. This task is typically tackled with deep generative models, such as diffusion models, Generative Adversarial Networks (GANs), or Variational Autoencoders (VAEs), trained on large video datasets. The models learn to generate plausible video frames conditioned on the input image and, optionally, on auxiliary signals such as audio or a text prompt.
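As a concrete illustration of conditioning a pretrained generative model on a still image, the sketch below uses the Hugging Face diffusers library with the publicly released Stable Video Diffusion checkpoint. This is one possible model choice among many, not the method of any specific paper listed here; the input path, resolution, and sampling parameters are illustrative assumptions.

```python
# Minimal image-to-video sketch using Hugging Face diffusers.
# Assumes `pip install diffusers transformers accelerate` and a CUDA GPU;
# the checkpoint name and file paths below are illustrative choices.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load a pretrained image-conditioned video diffusion model.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")

# The single still image that conditions the generated clip.
image = load_image("input.jpg").resize((1024, 576))  # checkpoint's native resolution

# Sample a short frame sequence; decode_chunk_size trades VRAM for speed.
generator = torch.manual_seed(42)  # fix the seed for reproducibility
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]

# Write the sampled frames out as a video file.
export_to_video(frames, "generated.mp4", fps=7)
```

The pipeline handles temporal consistency internally: the diffusion model denoises all frames jointly while attending to the conditioning image, so appearance stays anchored to the input while motion is synthesized.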

Papers

Showing 31–40 of 85 papers

Title | Status | Hype
OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation | | 0
MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation | | 0
Fleximo: Towards Flexible Text-to-Human Motion Video Generation | | 0
Identity-Preserving Text-to-Video Generation by Frequency Decomposition | Code | 4
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models | Code | 5
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation | Code | 7
A Survey of Emerging Approaches and Advances in Video Generation | | 0
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation | | 0
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation | | 0
MarDini: Masked Autoregressive Diffusion for Video Generation at Scale | | 0
Page 4 of 9

Leaderboard

No leaderboard results yet.