SOTAVerified

Video Generation

(Various Video Generation Tasks. GIF credit: MaGViT)

Papers

Showing 501-550 of 1466 papers

Title | Status | Hype
Generative Disco: Text-to-Video Generation for Music Visualization | Code | 1
Generative Adversarial Graph Convolutional Networks for Human Action Synthesis | Code | 1
Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks | Code | 1
InTraGen: Trajectory-controlled Video Generation for Object Interactions | Code | 1
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions | Code | 1
Stochastic Variational Video Prediction | Code | 1
Generating time-consistent dynamics with discriminator-guided image diffusion models |  | 0
Generating Persuasive Visual Storylines for Promotional Videos |  | 0
Deep Video Generation, Prediction and Completion of Human Action Sequences |  | 0
Generating Human Action Videos by Coupling 3D Game Engines and Probabilistic Graphical Models |  | 0
DeepVerse: 4D Autoregressive Video Generation as a World Model |  | 0
Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks |  | 0
Gender Bias in Text-to-Video Generation Models: A case study of Sora |  | 0
DeepRhythm: Exposing DeepFakes with Attentional Visual Heartbeat Rhythms |  | 0
GenDeF: Learning Generative Deformation Field for Video Generation |  | 0
GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model |  | 0
4Diffusion: Multi-view Video Diffusion Model for 4D Generation |  | 0
How I Warped Your Noise: a Temporally-Correlated Noise Prior for Diffusion Models |  | 0
DeepHS-HDRVideo: Deep High Speed High Dynamic Range Video Reconstruction |  | 0
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation |  | 0
Decouple Content and Motion for Conditional Image-to-Video Generation |  | 0
AnimateAnything: Consistent and Controllable Animation for Video Generation |  | 0
Modular Action Concept Grounding in Semantic Video Prediction |  | 0
How Far is Video Generation from World Model: A Physical Law Perspective |  | 0
How Much To Guide: Revisiting Adaptive Guidance in Classifier-Free Guidance Text-to-Vision Diffusion Models |  | 0
GameFactory: Creating New Games with Generative Interactive Videos |  | 0
GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving |  | 0
G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer |  | 0
FVD: A new Metric for Video Generation |  | 0
AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation |  | 0
FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling |  | 0
Fundus to Fluorescein Angiography Video Generation as a Retinal Generative Foundation Model |  | 0
AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation |  | 0
AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction |  | 0
Action Concept Grounding Network for Semantically-Consistent Video Generation |  | 0
FullDiT: Multi-Task Video Generative Foundation Model with Full Attention |  | 0
FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers |  | 0
Full-body High-resolution Anime Generation with Progressive Structure-conditional Generative Adversarial Networks |  | 0
DAPE: Dual-Stage Parameter-Efficient Fine-Tuning for Consistent Video Editing with Diffusion Models |  | 0
AvatarShield: Visual Reinforcement Learning for Human-Centric Video Forgery Detection |  | 0
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models |  | 0
From Single Images to Motion Policies via Video-Generation Environment Representations |  | 0
From Generation to Generalization: Emergent Few-Shot Learning in Video Diffusion Models |  | 0
Dance Any Beat: Blending Beats with Visuals in Dance Video Generation |  | 0
AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations |  | 0
AniClipart: Clipart Animation with Text-to-Video Priors |  | 0
How Do the Hearts of Deep Fakes Beat? Deep Fake Source Detection via Interpreting Residuals with Biological Signals |  | 0
HRVGAN: High Resolution Video Generation using Spatio-Temporal GAN |  | 0
FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise |  | 0
CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MCVD | FVD16 | 2,460 |  | Unverified
2 | VDM | FVD16 | 1,396 |  | Unverified
3 | TGAN-v2 (128x128) | FVD16 | 1,209 |  | Unverified
4 | MCVD (64x64) | FVD16 | 1,143 |  | Unverified
5 | MoCoGAN-HD (256x256, unconditional) | FVD16 | 700 |  | Unverified
6 | MagicVideo (256x256, text-conditional) | FVD16 | 699 |  | Unverified
7 | TATS (256x256) | FVD16 | 635 |  | Unverified
8 | FIFO-Diffusion | FVD128 | 596.64 |  | Unverified
9 | DIGAN (128x128, unconditional) | FVD16 | 577 |  | Unverified
10 | LVDM (256x256, unconditional) | FVD16 | 552 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MoCoGAN | FVD score | 503 |  | Unverified
2 | Baseline (from LVT) | FVD score | 320.9 |  | Unverified
3 | SVG-FP (from FVD) | FVD score | 315.5 |  | Unverified
4 | CDNA (from FVD) | FVD score | 296.5 |  | Unverified
5 | SV2P (from FVD) | FVD score | 262.5 |  | Unverified
6 | SVG-LP (from vRNN) | FVD score | 256.62 |  | Unverified
7 | WAM | FVD score | 159.6 |  | Unverified
8 | VRNN 1L | FVD score | 149.22 |  | Unverified
9 | SAVP (from vRNN) | FVD score | 143.43 |  | Unverified
10 | Hier-VRNN | FVD score | 143.4 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MoCoGAN-HD (128x128) | FVD 16 | 183.6 |  | Unverified
2 | TATS (128x128) | FVD 16 | 132.6 |  | Unverified
3 | Long-video GAN (256x256) | FVD 16 | 116.5 |  | Unverified
4 | DIGAN (128x128) | FVD 16 | 114.6 |  | Unverified
5 | Long-video GAN (128x128) | FVD 16 | 107.5 |  | Unverified
6 | LVDM (256x256) | FVD 16 | 95.2 |  | Unverified
7 | DDMI | FVD 16 | 66.25 |  | Unverified
8 | Latte + LeanVAE | FVD 16 | 49.59 |  | Unverified
9 | StyleSV (256x256) | FVD 16 | 49 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Video Diffusion Model | Inception Score | 57 |  | Unverified
2 | TGAN-ODE | Inception Score | 15.2 |  | Unverified
3 | TGAN-F | Inception Score | 13.62 |  | Unverified
4 | MoCoGAN | Inception Score | 12.42 |  | Unverified
5 | MoCoGAN-MDP | Inception Score | 11.86 |  | Unverified
6 | TGAN-SVC | Inception Score | 11.85 |  | Unverified
7 | VGAN | Inception Score | 8.18 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TGAN-F | Inception Score | 22.91 |  | Unverified
2 | TGANv2 | Inception Score | 21.45 |  | Unverified
3 | TGANv2-ODE | Inception Score | 21.02 |  | Unverified
4 | MoCoGAN | Inception Score | 12.42 |  | Unverified
5 | MoCoGAN-MDP | Inception Score | 11.86 |  | Unverified
6 | TGAN-SVC | Inception Score | 11.85 |  | Unverified
7 | VGAN | Inception Score | 8.18 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Imagen original (constant=6) | CLIP R-Precision | 92.12 |  | Unverified
2 | Imagen fully distilled (oscillate(15,1)) | CLIP R-Precision | 90.97 |  | Unverified
3 | Imagen distilled (constant=6) | CLIP R-Precision | 90.88 |  | Unverified
4 | Imagen original (oscillate(15,1)) | CLIP R-Precision | 89.91 |  | Unverified
5 | Imagen fully distilled (constant=6) | CLIP R-Precision | 89.68 |  | Unverified
6 | Imagen distilled (oscillate(15,1)) | CLIP R-Precision | 88.78 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DIGAN (256x256) | FVD16 | 156.7 |  | Unverified
2 | MoCoGAN-HD (128x128) | FVD16 | 144.7 |  | Unverified
3 | DIGAN (128x128) | FVD16 | 128.1 |  | Unverified
4 | LVDM (256x256) | FVD16 | 99 |  | Unverified
5 | TATS (128x128) | FVD16 | 94.6 |  | Unverified
6 | StyleSV (256x256) | FVD16 | 82.6 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TGANv2 (2020) | Inception Score | 28.87 |  | Unverified
2 | DVD-GAN | Inception Score | 27.38 |  | Unverified
3 | VideoGPT | Inception Score | 24.69 |  | Unverified
4 | TGANv2 | Inception Score | 24.34 |  | Unverified
5 | TGAN-F | Inception Score | 22.91 |  | Unverified
6 | TGANv2-ODE | Inception Score | 21.02 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DVD-GAN | FVD | 31.1 |  | Unverified
2 | MAGVIT | FVD | 9.9 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | INR-V | FVD16 | 144 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DVD-GAN | FID | 2.16 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DVD-GAN | FID | 12.92 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DiT-XL/2 + CVAE-FT-SE | FID | 8.59 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VideoAssembler (Zero-Shot, 256x256, class-conditional) | FVD16 | 252 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PG-SWGAN-3D | FID | 404.1 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | StyleSV | FVD16 | 207.2 |  | Unverified