SOTAVerified

Video Generation

(Various Video Generation Tasks. GIF credit: MaGViT)

Papers

Showing 1101-1150 of 1466 papers

| Title | Status | Hype |
|---|---|---|
| Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback | | 0 |
| Improving the Diffusability of Autoencoders | | 0 |
| Improving Video Generation with Human Feedback | | 0 |
| IM-Zero: Instance-level Motion Controllable Video Generation in a Zero-shot Manner | | 0 |
| Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models | | 0 |
| Inference Optimization of Foundation Models on AI Accelerators | | 0 |
| InfinityDrive: Breaking Time Limits in Driving World Models | | 0 |
| Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution | | 0 |
| InMoDeGAN: Interpretable Motion Decomposition Generative Adversarial Network for Video Generation | | 0 |
| InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption | | 0 |
| Instructional Video Generation | | 0 |
| InstructVideo: Instructing Video Diffusion Models with Human Feedback | | 0 |
| Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor | | 0 |
| Intention-driven Ego-to-Exo Video Generation | | 0 |
| Interactive Video Generation via Domain Adaptation | | 0 |
| InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation | | 0 |
| InterDyn: Controllable Interactive Dynamics with Video Diffusion Models | | 0 |
| InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation | | 0 |
| Interspatial Attention for Efficient 4D Human Video Generation | | 0 |
| Investigating Memorization in Video Diffusion Models | | 0 |
| IPO: Iterative Preference Optimization for Text-to-Video Generation | | 0 |
| Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation | | 0 |
| JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization | | 0 |
| Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation | | 0 |
| Jointly Trained Image and Video Generation using Residual Vectors | | 0 |
| JointTuner: Appearance-Motion Adaptive Joint Training for Customized Video Generation | | 0 |
| JoyHallo: Digital human model for Mandarin | | 0 |
| JPEG-LM: LLMs as Image Generators with Canonical Codec Representations | | 0 |
| JVID: Joint Video-Image Diffusion for Visual-Quality and Temporal-Consistency in Video Generation | | 0 |
| Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content | | 0 |
| Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation | | 0 |
| Label-Conditioned Next-Frame Video Generation with Neural Flows | | 0 |
| LaMD: Latent Motion Diffusion for Image-Conditional Video Generation | | 0 |
| LAMP: Learn A Motion Pattern for Few-Shot Video Generation | | 0 |
| Large Motion Video Autoencoding with Cross-modal Video VAE | | 0 |
| Latent-Reframe: Enabling Camera Control for Video Diffusion Model without Training | | 0 |
| Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation | | 0 |
| LayerAnimate: Layer-specific Control for Animation | | 0 |
| Layered Controllable Video Generation | | 0 |
| Learning Long-Term Style-Preserving Blind Video Temporal Consistency | | 0 |
| Learning Online Scale Transformation for Talking Head Video Generation | | 0 |
| Learning Real-World Action-Video Dynamics with Heterogeneous Masked Autoregression | | 0 |
| Learnings from Scaling Visual Tokenizers for Reconstruction and Generation | | 0 |
| Learning Temporally Consistent Video Depth from Video Diffusion Priors | | 0 |
| Learning to Deblur and Generate High Frame Rate Video with an Event Camera | | 0 |
| Learning to Generate Videos Using Neural Uncertainty Priors | | 0 |
| Learning Universal Policies via Text-Guided Video Generation | | 0 |
| Learning World Models for Interactive Video Generation | | 0 |
| Lets Play Music: Audio-driven Performance Video Generation | | 0 |
| LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis | | 0 |

Benchmark Results

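| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MCVD | FVD16 | 2,460 | | Unverified |
| 2 | VDM | FVD16 | 1,396 | | Unverified |
| 3 | TGAN-v2 (128x128) | FVD16 | 1,209 | | Unverified |
| 4 | MCVD (64x64) | FVD16 | 1,143 | | Unverified |
| 5 | MoCoGAN-HD (256x256, unconditional) | FVD16 | 700 | | Unverified |
| 6 | MagicVideo (256x256, text-conditional) | FVD16 | 699 | | Unverified |
| 7 | TATS (256x256) | FVD16 | 635 | | Unverified |
| 8 | FIFO-Diffusion | FVD128 | 596.64 | | Unverified |
| 9 | DIGAN (128x128, unconditional) | FVD16 | 577 | | Unverified |
| 10 | LVDM (256x256, unconditional) | FVD16 | 552 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MoCoGAN | FVD score | 503 | | Unverified |
| 2 | Baseline (from LVT) | FVD score | 320.9 | | Unverified |
| 3 | SVG-FP (from FVD) | FVD score | 315.5 | | Unverified |
| 4 | CDNA (from FVD) | FVD score | 296.5 | | Unverified |
| 5 | SV2P (from FVD) | FVD score | 262.5 | | Unverified |
| 6 | SVG-LP (from vRNN) | FVD score | 256.62 | | Unverified |
| 7 | WAM | FVD score | 159.6 | | Unverified |
| 8 | VRNN 1L | FVD score | 149.22 | | Unverified |
| 9 | SAVP (from vRNN) | FVD score | 143.43 | | Unverified |
| 10 | Hier-VRNN | FVD score | 143.4 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MoCoGAN-HD (128x128) | FVD 16 | 183.6 | | Unverified |
| 2 | TATS (128x128) | FVD 16 | 132.6 | | Unverified |
| 3 | Long-video GAN (256x256) | FVD 16 | 116.5 | | Unverified |
| 4 | DIGAN (128x128) | FVD 16 | 114.6 | | Unverified |
| 5 | Long-video GAN (128x128) | FVD 16 | 107.5 | | Unverified |
| 6 | LVDM (256x256) | FVD 16 | 95.2 | | Unverified |
| 7 | DDMI | FVD 16 | 66.25 | | Unverified |
| 8 | Latte + LeanVAE | FVD 16 | 49.59 | | Unverified |
| 9 | StyleSV (256x256) | FVD 16 | 49 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Video Diffusion Model | Inception Score | 57 | | Unverified |
| 2 | TGAN-ODE | Inception Score | 15.2 | | Unverified |
| 3 | TGAN-F | Inception Score | 13.62 | | Unverified |
| 4 | MoCoGAN | Inception Score | 12.42 | | Unverified |
| 5 | MoCoGAN-MDP | Inception Score | 11.86 | | Unverified |
| 6 | TGAN-SVC | Inception Score | 11.85 | | Unverified |
| 7 | VGAN | Inception Score | 8.18 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TGAN-F | Inception Score | 22.91 | | Unverified |
| 2 | TGANv2 | Inception Score | 21.45 | | Unverified |
| 3 | TGANv2-ODE | Inception Score | 21.02 | | Unverified |
| 4 | MoCoGAN | Inception Score | 12.42 | | Unverified |
| 5 | MoCoGAN-MDP | Inception Score | 11.86 | | Unverified |
| 6 | TGAN-SVC | Inception Score | 11.85 | | Unverified |
| 7 | VGAN | Inception Score | 8.18 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Imagen original (constant=6) | CLIP R-Precision | 92.12 | | Unverified |
| 2 | Imagen fully distilled (oscillate (15,1)) | CLIP R-Precision | 90.97 | | Unverified |
| 3 | Imagen distilled (constant=6) | CLIP R-Precision | 90.88 | | Unverified |
| 4 | Imagen original (oscillate(15,1)) | CLIP R-Precision | 89.91 | | Unverified |
| 5 | Imagen fully distilled (constant=6) | CLIP R-Precision | 89.68 | | Unverified |
| 6 | Imagen distilled (oscillate (15,1)) | CLIP R-Precision | 88.78 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DIGAN (256x256) | FVD16 | 156.7 | | Unverified |
| 2 | MoCoGAN-HD (128x128) | FVD16 | 144.7 | | Unverified |
| 3 | DIGAN (128x128) | FVD16 | 128.1 | | Unverified |
| 4 | LVDM (256x256) | FVD16 | 99 | | Unverified |
| 5 | TATS (128x128) | FVD16 | 94.6 | | Unverified |
| 6 | StyleSV (256x256) | FVD16 | 82.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TGANv2 (2020) | Inception Score | 28.87 | | Unverified |
| 2 | DVD-GAN | Inception Score | 27.38 | | Unverified |
| 3 | VideoGPT | Inception Score | 24.69 | | Unverified |
| 4 | TGANv2 | Inception Score | 24.34 | | Unverified |
| 5 | TGAN-F | Inception Score | 22.91 | | Unverified |
| 6 | TGANv2-ODE | Inception Score | 21.02 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DVD-GAN | FVD | 31.1 | | Unverified |
| 2 | MAGVIT | FVD | 9.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | INR-V | FVD16 | 144 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DVD-GAN | FID | 2.16 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DVD-GAN | FID | 12.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DiT-XL/2 + CVAE-FT-SE | FID | 8.59 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VideoAssembler (Zero-Shot, 256x256, class-conditional) | FVD16 | 252 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PG-SWGAN-3D | FID | 404.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | StyleSV | FVD16 | 207.2 | | Unverified |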