| Towards High Resolution Video Generation with Progressive Growing of Sliced Wasserstein GANs | Oct 4, 2018 | Action Recognition, Image Generation | Code Available | 1 |
| Stochastic Variational Video Prediction | Oct 30, 2017 | Prediction, Video Generation | Code Available | 1 |
| MoCoGAN: Decomposing Motion and Content for Video Generation | Jul 17, 2017 | Generative Adversarial Network, Video Generation | Code Available | 1 |
| GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium | Jun 26, 2017 | Emotion Recognition in Conversation, Image Generation | Code Available | 1 |
| Sliced Wasserstein Generative Models | Jun 8, 2017 | Image Generation, Video Generation | Code Available | 1 |
| Temporal Generative Adversarial Nets with Singular Value Clipping | Nov 21, 2016 | Video Generation | Code Available | 1 |
| World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving | Jul 17, 2025 | Accident Anticipation, Autonomous Driving | Unverified | 0 |
| Leveraging Pre-Trained Visual Models for AI-Generated Video Detection | Jul 17, 2025 | Misinformation, Video Generation | Unverified | 0 |
| LoViC: Efficient Long Video Generation with Context Compression | Jul 17, 2025 | Text-to-Video Generation, Video Generation | Unverified | 0 |
| Taming Diffusion Transformer for Real-Time Mobile Video Generation | Jul 17, 2025 | Video Generation | Unverified | 0 |
| Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective | Jul 11, 2025 | Video Generation | Code Available | 0 |
| Martian World Models: Controllable Video Synthesis with Physically Accurate 3D Reconstructions | Jul 10, 2025 | Video Generation | Unverified | 0 |
| Scaling RL to Long Videos | Jul 10, 2025 | Reinforcement Learning (RL), Spatial Reasoning | Unverified | 0 |
| FIFA: Unified Faithfulness Evaluation Framework for Text-to-Video and Video-to-Text Generation | Jul 9, 2025 | Descriptive, Text Generation | Unverified | 0 |
| A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality | Jul 9, 2025 | Diversity, Video Generation | Unverified | 0 |
| Tora2: Motion and Appearance Customized Diffusion Transformer for Multi-Entity Video Generation | Jul 8, 2025 | Video Generation | Unverified | 0 |
| AnyI2V: Animating Any Conditional Image with Motion Control | Jul 3, 2025 | Style Transfer, Video Generation | Unverified | 0 |
| LLM-based Realistic Safety-Critical Driving Video Generation | Jul 2, 2025 | Autonomous Driving, Autonomous Vehicles | Unverified | 0 |
| Geometry-aware 4D Video Generation for Robot Manipulation | Jul 1, 2025 | Robot Manipulation, Video Generation | Unverified | 0 |
| RoboScape: Physics-informed Embodied World Model | Jun 29, 2025 | 3D geometry, Depth Estimation | Code Available | 0 |
| Video Virtual Try-on with Conditional Diffusion Transformer Inpainter | Jun 26, 2025 | Video Generation, Video Inpainting | Unverified | 0 |
| Consistent Zero-shot 3D Texture Synthesis Using Geometry-aware Diffusion and Temporal Video Models | Jun 26, 2025 | Texture Synthesis, Video Generation | Unverified | 0 |
| ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models | Jun 26, 2025 | Spatial Reasoning, Video Generation | Unverified | 0 |
| HieraSurg: Hierarchy-Aware Diffusion Model for Surgical Video Generation | Jun 26, 2025 | Panoptic Segmentation, Segmentation | Unverified | 0 |
| SmoothSinger: A Conditional Diffusion Model for Singing Voice Synthesis with Multi-Resolution Architecture | Jun 26, 2025 | Denoising, Singing Voice Synthesis | Unverified | 0 |
| DFVEdit: Conditional Delta Flow Vector for Zero-shot Video Editing | Jun 26, 2025 | Video Editing, Video Generation | Unverified | 0 |
| BrokenVideos: A Benchmark Dataset for Fine-Grained Artifact Localization in AI-Generated Videos | Jun 25, 2025 | Artifact Detection, Benchmarking | Unverified | 0 |
| Video Perception Models for 3D Scene Synthesis | Jun 25, 2025 | 3D Reconstruction, Image Generation | Unverified | 0 |
| MinD: Unified Visual Imagination and Control via Hierarchical World Models | Jun 23, 2025 | Video Generation, Video Prediction | Unverified | 0 |
| OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation | Jun 23, 2025 | Human Animation, Video Generation | Unverified | 0 |
| RDPO: Real Data Preference Optimization for Physics Consistency Video Generation | Jun 23, 2025 | Video Generation | Unverified | 0 |
| BulletGen: Improving 4D Reconstruction with Bullet-Time Generation | Jun 23, 2025 | 4D reconstruction, Depth Estimation | Unverified | 0 |
| Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition | Jun 20, 2025 | Temporal Sequences, Video Generation | Unverified | 0 |
| PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models | Jun 19, 2025 | Image Generation, Quantization | Unverified | 0 |
| Sekai: A Video Dataset towards World Exploration | Jun 18, 2025 | Video Generation | Unverified | 0 |
| VideoMAR: Autoregressive Video Generation with Continuous Tokens | Jun 17, 2025 | GPU, Image Generation | Unverified | 0 |
| Causally Steered Diffusion for Automated Video Counterfactual Generation | Jun 17, 2025 | counterfactual, Video Editing | Code Available | 0 |
| UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions | Jun 16, 2025 | 4k, 8k | Unverified | 0 |
| STAGE: A Stream-Centric Generative World Model for Long-Horizon Driving-Scene Simulation | Jun 16, 2025 | Autonomous Driving, Denoising | Unverified | 0 |
| iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer | Jun 15, 2025 | Object, Video Generation | Unverified | 0 |
| M4V: Multi-Modal Mamba for Text-to-Video Generation | Jun 12, 2025 | Mamba, Text-to-Video Generation | Unverified | 0 |
| DreamActor-H1: High-Fidelity Human-Product Demonstration Video Generation via Motion-designed Diffusion Transformers | Jun 12, 2025 | Data Augmentation, Marketing | Unverified | 0 |
| GigaVideo-1: Advancing Video Generation via Automatic Feedback with 4 GPU-Hours Fine-Tuning | Jun 12, 2025 | GPU, Video Generation | Unverified | 0 |
| GenWorld: Towards Detecting AI-generated Real-world Simulation Videos | Jun 12, 2025 | Video Generation | Unverified | 0 |
| PlayerOne: Egocentric World Simulator | Jun 11, 2025 | Video Generation | Unverified | 0 |
| HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation | Jun 10, 2025 | Human Animation, Human-Object Interaction Detection | Unverified | 0 |
| How Much To Guide: Revisiting Adaptive Guidance in Classifier-Free Guidance Text-to-Vision Diffusion Models | Jun 10, 2025 | Denoising, Video Generation | Unverified | 0 |
| PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement | Jun 9, 2025 | Video Generation | Unverified | 0 |
| Audio-Sync Video Generation with Multi-Stream Temporal Control | Jun 9, 2025 | Audio-Visual Synchronization, Video Alignment | Unverified | 0 |
| Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion | Jun 9, 2025 | GPU, Video Generation | Unverified | 0 |