| ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models | Nov 16, 2024 | Hallucination, Video Generation | —Unverified | 0 |
| ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler | Oct 8, 2024 | GPU, Video Generation | —Unverified | 0 |
| ViDA-MAN: Visual Dialog with Digital Humans | Oct 26, 2021 | speech-recognition, Speech Recognition | —Unverified | 0 |
| VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation | Feb 11, 2025 | Image to Video Generation, Object | —Unverified | 0 |
| VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control | Jan 2, 2025 | Talking Head Generation, Video Generation | —Unverified | 0 |
| Video as the New Language for Real-World Decision Making | Feb 27, 2024 | Decision Making, In-Context Learning | —Unverified | 0 |
| VideoAuteur: Towards Long Narrative Video Generation | Jan 10, 2025 | Video Generation | —Unverified | 0 |
| Video Autoencoder: self-supervised disentanglement of static 3D structure and motion | Oct 6, 2021 | Camera Pose Estimation, Disentanglement | —Unverified | 0 |
| Video-Bench: Human-Aligned Video Generation Benchmark | Jan 1, 2025 | Large Language Model, Video Generation | —Unverified | 0 |
| VideoBooth: Diffusion-based Video Generation with Image Prompts | Dec 1, 2023 | Video Generation | —Unverified | 0 |
| Video Content Swapping Using GAN | Nov 21, 2021 | Data Augmentation, Video Generation | —Unverified | 0 |
| Video Creation by Demonstration | Dec 12, 2024 | Video Generation | —Unverified | 0 |
| VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning | Sep 26, 2023 | Image Generation, Video Generation | —Unverified | 0 |
| VideoDPO: Omni-Preference Alignment for Video Diffusion Generation | Dec 18, 2024 | Image Generation, Text-to-Video Generation | —Unverified | 0 |
| VideoDreamer: Customized Multi-Subject Text-to-Video Generation with Disen-Mix Finetuning | Nov 2, 2023 | Attribute, Text-to-Video Generation | —Unverified | 0 |
| Video Editing via Factorized Diffusion Distillation | Mar 14, 2024 | Video Editing, Video Generation | —Unverified | 0 |
| VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation | Mar 4, 2019 | Predict Future Video Frames, Video Generation | —Unverified | 0 |
| VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation | Sep 1, 2023 | Decoder, Image Generation | —Unverified | 0 |
| Video Generation Beyond a Single Clip | Apr 15, 2023 | Video Generation | —Unverified | 0 |
| Video Generation from Text Employing Latent Path Construction for Temporal Modeling | Jul 29, 2021 | Text-to-Video Generation, Video Generation | —Unverified | 0 |
| Video Generation with Consistency Tuning | Mar 11, 2024 | Video Generation | —Unverified | 0 |
| Video Generation with Learned Action Prior | Jun 20, 2024 | Image Generation, Image Reconstruction | —Unverified | 0 |
| VideoGen: Generative Modeling of Videos using VQ-VAE and Transformers | Jan 1, 2021 | Position, Video Generation | —Unverified | 0 |
| VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention | Mar 19, 2025 | Video Generation | —Unverified | 0 |
| VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing | Feb 24, 2025 | Video Editing, Video Generation | —Unverified | 0 |
| Video-Infinity: Distributed Long Video Generation | Jun 24, 2024 | GPU, Video Generation | —Unverified | 0 |
| Video Is Worth a Thousand Images: Exploring the Latest Trends in Long Video Generation | Dec 24, 2024 | Video Generation | —Unverified | 0 |
| VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models | Feb 4, 2025 | Motion Generation, motion prediction | —Unverified | 0 |
| Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation | Feb 1, 2025 | Image Generation, Video Generation | —Unverified | 0 |
| VideoLCM: Video Latent Consistency Model | Dec 14, 2023 | Computational Efficiency, Image Generation | —Unverified | 0 |
| VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models | Mar 27, 2025 | Text-to-Video Generation, Video Generation | —Unverified | 0 |
| VideoMAR: Autoregressive Video Generation with Continuous Tokens | Jun 17, 2025 | GPU, Image Generation | —Unverified | 0 |
| VideoMerge: Towards Training-free Long Video Generation | Mar 13, 2025 | Denoising, Video Generation | —Unverified | 0 |
| Video Motion Graphs | Mar 26, 2025 | Motion Interpolation, Video Frame Interpolation | —Unverified | 0 |
| VideoPanda: Video Panoramic Diffusion with Multi-view Attention | Apr 15, 2025 | Video Generation | —Unverified | 0 |
| Video Perception Models for 3D Scene Synthesis | Jun 25, 2025 | 3D Reconstruction, Image Generation | —Unverified | 0 |
| VideoPhy: Evaluating Physical Commonsense for Video Generation | Jun 5, 2024 | Video Generation | —Unverified | 0 |
| VideoPoet: A Large Language Model for Zero-Shot Video Generation | Dec 21, 2023 | Decoder, Language Modeling | —Unverified | 0 |
| VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement | Nov 22, 2024 | Text-to-Video Generation, Video Alignment | —Unverified | 0 |
| VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling | Mar 20, 2025 | 3DGS, Text to 3D | —Unverified | 0 |
| Video Signature: In-generation Watermarking for Latent Video Diffusion Models | May 31, 2025 | Decoder, Video Generation | —Unverified | 0 |
| Rethinking Video Super-Resolution: Towards Diffusion-Based Methods without Motion Alignment | Mar 5, 2025 | All, Super-Resolution | —Unverified | 0 |
| Video-T1: Test-Time Scaling for Video Generation | Mar 24, 2025 | Denoising, Video Generation | —Unverified | 0 |
| Video-to-Audio Generation with Fine-grained Temporal Semantics | Sep 23, 2024 | Audio Generation, Video Generation | —Unverified | 0 |
| Video-to-Audio Generation with Hidden Alignment | Jul 10, 2024 | Audio Generation, Data Augmentation | —Unverified | 0 |
| Video Virtual Try-on with Conditional Diffusion Transformer Inpainter | Jun 26, 2025 | Video Generation, Video Inpainting | —Unverified | 0 |
| VideoWorld: Exploring Knowledge Learning from Unlabeled Videos | Jan 16, 2025 | Video Generation | —Unverified | 0 |
| VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models | Nov 30, 2023 | Semantic Segmentation, Video Editing | —Unverified | 0 |
| VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation | Nov 14, 2024 | Denoising, Robot Manipulation | —Unverified | 0 |
| VidPanos: Generative Panoramic Videos from Casual Panning Videos | Oct 17, 2024 | Image Stitching, Video Generation | —Unverified | 0 |