| DreamVideo: Composing Your Dream Videos with Customized Subject and Motion | Dec 7, 2023 | Image Generation, Video Generation | —Unverified | 0 | 0 |
| DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance | Dec 5, 2023 | Image to Video Generation, Video Generation | —Unverified | 0 | 0 |
| DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation | Oct 17, 2024 | 3DGS, 4D reconstruction | —Unverified | 0 | 0 |
| DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving | Aug 29, 2024 | Autonomous Driving, Denoising | —Unverified | 0 | 0 |
| DriveScape: High-Resolution Driving Video Generation by Multi-View Feature Fusion | Jan 1, 2025 | Autonomous Driving, Denoising | —Unverified | 0 | 0 |
| DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation | Sep 9, 2024 | Autonomous Driving, Video Generation | —Unverified | 0 | 0 |
| DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers | Dec 24, 2024 | NavSim, Trajectory Planning | —Unverified | 0 | 0 |
| Dual-MTGAN: Stochastic and Deterministic Motion Transfer for Image-to-Video Synthesis | Feb 26, 2021 | Motion Generation, Video Generation | —Unverified | 0 | 0 |
| DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization | May 4, 2025 | Denoising, Text-to-Video Generation | —Unverified | 0 | 0 |
| Dual-Stream Diffusion Net for Text-to-Video Generation | Aug 16, 2023 | Text-to-Video Generation, Video Generation | —Unverified | 0 | 0 |
| DualX-VSR: Dual Axial Spatial×Temporal Transformer for Real-World Video Super-Resolution without Motion Compensation | Jun 5, 2025 | Motion Compensation, Optical Flow Estimation | —Unverified | 0 | 0 |
| Dynamic Camera Poses and Where to Find Them | Jan 1, 2025 | Point Tracking, Pose Estimation | —Unverified | 0 | 0 |
| Dynamic-I2V: Exploring Image-to-Video Generaion Models via Multimodal LLM | May 26, 2025 | Image to Video Generation, Video Generation | —Unverified | 0 | 0 |
| Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions | Apr 13, 2022 | Video Generation | —Unverified | 0 | 0 |
| DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes | Dec 15, 2024 | Denoising, Video Generation | —Unverified | 0 | 0 |
| DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation | Apr 21, 2025 | Attribute, Denoising | —Unverified | 0 | 0 |
| E2VIDiff: Perceptual Events-to-Video Reconstruction using Diffusion Priors | Jul 11, 2024 | Image Generation, Video Generation | —Unverified | 0 | 0 |
| EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation | Aug 23, 2024 | Image Generation, Video Generation | —Unverified | 0 | 0 |
| EasyGenNet: An Efficient Framework for Audio-Driven Gesture Video Generation Based on Diffusion Model | Apr 11, 2025 | Gesture Generation, Video Generation | —Unverified | 0 | 0 |
| Echocardiography video synthesis from end diastolic semantic map via diffusion model | Oct 11, 2023 | Denoising, Video Generation | —Unverified | 0 | 0 |
| EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation | Mar 28, 2025 | Medical Image Analysis, Privacy Preserving | —Unverified | 0 | 0 |
| EEG to fMRI Synthesis: Is Deep Learning a candidate? | Sep 29, 2020 | Deep Learning, EEG | —Unverified | 0 | 0 |
| Efficient training for future video generation based on hierarchical disentangled representation of latent variables | Jun 7, 2021 | Future prediction, Image Generation | —Unverified | 0 | 0 |
| Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition | Mar 21, 2024 | Video Generation | —Unverified | 0 | 0 |
| EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation | Nov 13, 2024 | Video Generation | —Unverified | 0 | 0 |
| EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation | Jan 1, 2025 | Image Generation, Text-to-Video Generation | —Unverified | 0 | 0 |
| EMO2: End-Effector Guided Audio-Driven Avatar Video Generation | Jan 18, 2025 | Gesture Generation, Video Generation | —Unverified | 0 | 0 |
| EMO: Emote Portrait Alive -- Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions | Feb 27, 2024 | Video Generation | —Unverified | 0 | 0 |
| Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs | Aug 26, 2023 | In-Context Learning, Video Generation | —Unverified | 0 | 0 |
| Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning | Nov 17, 2023 | Text-to-Video Generation, Video Generation | —Unverified | 0 | 0 |
| Enabling Versatile Controls for Video Diffusion Models | Mar 21, 2025 | Text-to-Video Generation, Video Generation | —Unverified | 0 | 0 |
| Enabling Visual Composition and Animation in Unsupervised Video Generation | Mar 21, 2024 | Video Generation | —Unverified | 0 | 0 |
| Endora: Video Generation Models as Endoscopy Simulators | Mar 17, 2024 | Data Augmentation, Video Generation | —Unverified | 0 | 0 |
| Enhancing Facial Consistency in Conditional Video Generation via Facial Landmark Transformation | Dec 12, 2024 | Video Generation | —Unverified | 0 | 0 |
| Enhancing Multi-Text Long Video Generation Consistency without Tuning: Time-Frequency Analysis, Prompt Alignment, and Theory | Dec 23, 2024 | Video Generation | —Unverified | 0 | 0 |
| EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis | Mar 16, 2025 | Accident Anticipation, Video Generation | —Unverified | 0 | 0 |
| EVA: An Embodied World Model for Future Video Anticipation | Oct 20, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Evaluating Robot Policies in a World Model | May 31, 2025 | model, Video Generation | —Unverified | 0 | 0 |
| EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation | Mar 24, 2025 | Benchmarking, Data Augmentation | —Unverified | 0 | 0 |
| Event-based High Dynamic Range Image and Very High Frame Rate Video Generation using Conditional Generative Adversarial Networks | Nov 20, 2018 | Video Generation, Vocal Bursts Intensity Prediction | —Unverified | 0 | 0 |
| Everybody Sign Now: Translating Spoken Language to Photo Realistic Sign Language Video | Nov 19, 2020 | Sign Language Production, Video Generation | —Unverified | 0 | 0 |
| Every Image Listens, Every Image Dances: Music-Driven Image Animation | Jan 30, 2025 | Image Animation, Video Generation | —Unverified | 0 | 0 |
| Every Smile is Unique: Landmark-Guided Diverse Smile Generation | Feb 6, 2018 | Video Generation | —Unverified | 0 | 0 |
| Explaining Vision and Language through Graphs of Events in Space and Time | Aug 29, 2023 | Graph Matching, Video Generation | —Unverified | 0 | 0 |
| Explorative Inbetweening of Time and Space | Mar 21, 2024 | Denoising, Video Generation | —Unverified | 0 | 0 |
| Exploring the Hyperparameter Space of Image Diffusion Models for Echocardiogram Generation | Nov 2, 2023 | Video Generation | —Unverified | 0 | 0 |
| Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey | Nov 5, 2024 | 3D Scene Reconstruction, Autonomous Driving | —Unverified | 0 | 0 |
| Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method | May 7, 2024 | Video Generation | —Unverified | 0 | 0 |
| Eye2Eye: A Simple Approach for Monocular-to-Stereo Video Synthesis | Apr 30, 2025 | Disparity Estimation, Transparent objects | —Unverified | 0 | 0 |
| FAAC: Facial Animation Generation with Anchor Frame and Conditional Control for Superior Fidelity and Editability | Dec 6, 2023 | Face Model, Video Generation | —Unverified | 0 | 0 |