| T-SVG: Text-Driven Stereoscopic Video Generation | Dec 12, 2024 | Depth Estimation, Text-to-Video Generation | —Unverified | 0 | 0 |
| Tuning-Free Long Video Generation via Global-Local Collaborative Diffusion | Jan 8, 2025 | Denoising, Diversity | —Unverified | 0 | 0 |
| Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation | Mar 5, 2024 | Denoising, Image Animation | —Unverified | 0 | 0 |
| Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis | Apr 20, 2025 | 2k, Knowledge Distillation | —Unverified | 0 | 0 |
| Tutorial on Diffusion Models for Imaging and Vision | Mar 26, 2024 | Image Generation, Text to Image Generation | —Unverified | 0 | 0 |
| TVG: A Training-free Transition Video Generation Method with Diffusion Models | Aug 24, 2024 | GPR, Video Generation | —Unverified | 0 | 0 |
| UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions | Jun 16, 2025 | 4k, 8k | —Unverified | 0 | 0 |
| Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models | Mar 26, 2025 | Video Generation | —Unverified | 0 | 0 |
| Understanding World or Predicting Future? A Comprehensive Survey of World Models | Nov 21, 2024 | Autonomous Driving, Decision Making | —Unverified | 0 | 0 |
| UniCP: A Unified Caching and Pruning Framework for Efficient Video Generation | Feb 6, 2025 | Computational Efficiency, Video Generation | —Unverified | 0 | 0 |
| Unified Dense Prediction of Video Diffusion | Mar 12, 2025 | Prediction, Video Generation | —Unverified | 0 | 0 |
| Unified Video Action Model | Feb 28, 2025 | model, Prediction | —Unverified | 0 | 0 |
| UniForm: A Unified Multi-Task Diffusion Transformer for Audio-Video Generation | Feb 6, 2025 | Audio Generation, Diversity | —Unverified | 0 | 0 |
| UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation | May 30, 2025 | Video Generation | —Unverified | 0 | 0 |
| UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics | Dec 10, 2024 | Image Generation, Video Generation | —Unverified | 0 | 0 |
| UniVG: Towards UNIfied-modal Video Generation | Jan 17, 2024 | Video Generation | —Unverified | 0 | 0 |
| Unlearning Concepts from Text-to-Video Diffusion Models | Jul 19, 2024 | Text-to-Video Generation, Video Generation | —Unverified | 0 | 0 |
| Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation | Jun 3, 2024 | Autonomous Driving, Video Generation | —Unverified | 0 | 0 |
| Unpaired Cartoon Image Synthesis via Gated Cycle Mapping | Jan 1, 2022 | Image Generation, Video Generation | —Unverified | 0 | 0 |
| Unsupervised Bi-directional Flow-based Video Generation from one Snapshot | Mar 3, 2019 | Video Generation | —Unverified | 0 | 0 |
| V3GAN: Decomposing Background, Foreground and Motion for Video Generation | Mar 26, 2022 | Generative Adversarial Network, Video Generation | —Unverified | 0 | 0 |
| VACT: A Video Automatic Causal Testing System and a Benchmark | Mar 8, 2025 | Large Language Model, Video Generation | —Unverified | 0 | 0 |
| VAST 1.0: A Unified Framework for Controllable and Consistent Video Generation | Dec 21, 2024 | Video Generation | —Unverified | 0 | 0 |
| VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control | Jul 17, 2024 | Video Generation | —Unverified | 0 | 0 |
| VEnhancer: Generative Space-Time Enhancement for Video Generation | Jul 10, 2024 | Data Augmentation, Super-Resolution | —Unverified | 0 | 0 |
| V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation | Jun 4, 2024 | Video Generation | —Unverified | 0 | 0 |
| VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption | May 17, 2025 | Decoder, Position | —Unverified | 0 | 0 |
| ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models | Nov 16, 2024 | Hallucination, Video Generation | —Unverified | 0 | 0 |
| ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler | Oct 8, 2024 | GPU, Video Generation | —Unverified | 0 | 0 |
| ViDA-MAN: Visual Dialog with Digital Humans | Oct 26, 2021 | speech-recognition, Speech Recognition | —Unverified | 0 | 0 |
| VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation | Feb 11, 2025 | Image to Video Generation, Object | —Unverified | 0 | 0 |
| VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control | Jan 2, 2025 | Talking Head Generation, Video Generation | —Unverified | 0 | 0 |
| Video as the New Language for Real-World Decision Making | Feb 27, 2024 | Decision Making, In-Context Learning | —Unverified | 0 | 0 |
| VideoAuteur: Towards Long Narrative Video Generation | Jan 10, 2025 | Video Generation | —Unverified | 0 | 0 |
| Video Autoencoder: self-supervised disentanglement of static 3D structure and motion | Oct 6, 2021 | Camera Pose Estimation, Disentanglement | —Unverified | 0 | 0 |
| Video-Bench: Human-Aligned Video Generation Benchmark | Jan 1, 2025 | Large Language Model, Video Generation | —Unverified | 0 | 0 |
| VideoBooth: Diffusion-based Video Generation with Image Prompts | Dec 1, 2023 | Video Generation | —Unverified | 0 | 0 |
| Video Content Swapping Using GAN | Nov 21, 2021 | Data Augmentation, Video Generation | —Unverified | 0 | 0 |
| Video Creation by Demonstration | Dec 12, 2024 | Video Generation | —Unverified | 0 | 0 |
| VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning | Sep 26, 2023 | Image Generation, Video Generation | —Unverified | 0 | 0 |
| VideoDPO: Omni-Preference Alignment for Video Diffusion Generation | Dec 18, 2024 | Image Generation, Text-to-Video Generation | —Unverified | 0 | 0 |
| VideoDreamer: Customized Multi-Subject Text-to-Video Generation with Disen-Mix Finetuning | Nov 2, 2023 | Attribute, Text-to-Video Generation | —Unverified | 0 | 0 |
| Video Editing via Factorized Diffusion Distillation | Mar 14, 2024 | Video Editing, Video Generation | —Unverified | 0 | 0 |
| VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation | Mar 4, 2019 | Predict Future Video Frames, Video Generation | —Unverified | 0 | 0 |
| VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation | Sep 1, 2023 | Decoder, Image Generation | —Unverified | 0 | 0 |
| Video Generation Beyond a Single Clip | Apr 15, 2023 | Video Generation | —Unverified | 0 | 0 |
| Video Generation from Text Employing Latent Path Construction for Temporal Modeling | Jul 29, 2021 | Text-to-Video Generation, Video Generation | —Unverified | 0 | 0 |
| Video Generation with Consistency Tuning | Mar 11, 2024 | Video Generation | —Unverified | 0 | 0 |
| Video Generation with Learned Action Prior | Jun 20, 2024 | Image Generation, Image Reconstruction | —Unverified | 0 | 0 |
| VideoGen: Generative Modeling of Videos using VQ-VAE and Transformers | Jan 1, 2021 | Position, Video Generation | —Unverified | 0 | 0 |