| Latent-Reframe: Enabling Camera Control for Video Diffusion Model without Training | Dec 8, 2024 | Video Generation | Unverified | 0 |
| Mind the Time: Temporally-Controlled Multi-Event Video Generation | Dec 6, 2024 | Video Generation | Unverified | 0 |
| DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models | Dec 5, 2024 | Temporal Sequences, Video Generation | Unverified | 0 |
| PaintScene4D: Consistent 4D Scene Generation from Text Prompts | Dec 5, 2024 | Scene Generation, Video Generation | Unverified | 0 |
| IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation | Dec 5, 2024 | Disentanglement, Talking Head Generation | Unverified | 0 |
| MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation | Dec 5, 2024 | Portrait Animation, Video Generation | Unverified | 0 |
| Movie Gen: SWOT Analysis of Meta's Generative AI Foundation Model for Transforming Media Generation, Advertising, and Entertainment Industries | Dec 5, 2024 | Video Generation | Unverified | 0 |
| GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration | Dec 5, 2024 | Attribute, Hallucination | Unverified | 0 |
| Instructional Video Generation | Dec 5, 2024 | Video Generation | Unverified | 0 |
| Mimir: Improving Video Diffusion Models for Precise Text Understanding | Dec 4, 2024 | Decoder, Reading Comprehension | Unverified | 0 |
| Advancing Auto-Regressive Continuation for Video Frames | Dec 4, 2024 | Autonomous Driving, Optical Flow Estimation | Unverified | 0 |
| SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model | Dec 4, 2024 | Video Generation | Unverified | 0 |
| Seeing Beyond Views: Multi-View Driving Scene Video Generation with Holistic Attention | Dec 4, 2024 | Autonomous Driving, Video Generation | Unverified | 0 |
| Imagine360: Immersive 360 Video Generation from Perspective Anchor | Dec 4, 2024 | Denoising, Video Denoising | Unverified | 0 |
| AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction | Dec 3, 2024 | 3D Reconstruction, Video Generation | Unverified | 0 |
| OmniCreator: Self-Supervised Unified Generation with Universal Editing | Dec 3, 2024 | Denoising, Semantic correspondence | Unverified | 0 |
| Motion Prompting: Controlling Video Generation with Motion Trajectories | Dec 3, 2024 | Video Generation | Unverified | 0 |
| Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback | Dec 3, 2024 | Object, Offline RL | Unverified | 0 |
| InfinityDrive: Breaking Time Limits in Driving World Models | Dec 2, 2024 | Autonomous Driving, Diversity | Unverified | 0 |
| Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation | Dec 2, 2024 | Diversity, Video Generation | Unverified | 0 |
| CPA: Camera-pose-awareness Diffusion Transformer for Video Generation | Dec 2, 2024 | Text-to-Video Generation, Video Generation | Unverified | 0 |
| World-consistent Video Diffusion with Explicit 3D Modeling | Dec 2, 2024 | 3D Generation, Image Generation | Unverified | 0 |
| FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait | Dec 2, 2024 | Image Animation, Video Generation | Unverified | 0 |
| MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models | Dec 2, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation | Dec 1, 2024 | Video Generation | Unverified | 0 |
| DIVD: Deblurring with Improved Video Diffusion Model | Dec 1, 2024 | Deblurring, model | Unverified | 0 |
| Human Action CLIPs: Detecting AI-generated Human Motion | Nov 30, 2024 | Video Generation | Unverified | 0 |
| Motion Dreamer: Realizing Physically Coherent Video Generation through Scene-Aware Motion Reasoning | Nov 30, 2024 | Autonomous Driving, Motion Generation | Code Available | 0 |
| Fleximo: Towards Flexible Text-to-Human Motion Video Generation | Nov 29, 2024 | Image to Video Generation, Large Language Model | Unverified | 0 |
| Motion Modes: What Could Happen Next? | Nov 29, 2024 | Diversity, Object | Unverified | 0 |
| SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing | Nov 28, 2024 | Intent Recognition, Model Selection | Unverified | 0 |
| OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation | Nov 28, 2024 | Video Generation | Unverified | 0 |
| MSG score: A Comprehensive Evaluation for Multi-Scene Video Generation | Nov 28, 2024 | Video Generation | Unverified | 0 |
| Trajectory Attention for Fine-grained Video Motion Control | Nov 28, 2024 | Inductive Bias, Video Editing | Unverified | 0 |
| MotionCharacter: Identity-Preserving and Motion Controllable Human Video Generation | Nov 27, 2024 | Attribute, Video Generation | Unverified | 0 |
| AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers | Nov 27, 2024 | Camera Pose Estimation, Pose Estimation | Unverified | 0 |
| Towards Chunk-Wise Generation for Long Videos | Nov 27, 2024 | Denoising, GPU | Unverified | 0 |
| Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models | Nov 27, 2024 | Model Compression, Video Generation | Unverified | 0 |
| Free^2Guide: Gradient-Free Path Integral Control for Enhancing Text-to-Video Generation with Large Vision-Language Models | Nov 26, 2024 | Reinforcement Learning (RL), Text-to-Video Generation | Unverified | 0 |
| Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey | Nov 26, 2024 | DeepFake Detection, Face Swapping | Unverified | 0 |
| AnchorCrafter: Animate CyberAnchors Saling Your Products via Human-Object Interacting Video Generation | Nov 26, 2024 | Human-Object Interaction Detection, Object | Unverified | 0 |
| PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation | Nov 26, 2024 | Video Generation | Unverified | 0 |
| PhysMotion: Physics-Grounded Dynamics From a Single Image | Nov 26, 2024 | Video Generation | Unverified | 0 |
| DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation | Nov 25, 2024 | Large Language Model, Motion Planning | Unverified | 0 |
| Pathways on the Image Manifold: Image Editing via Video Generation | Nov 25, 2024 | Text-based Image Editing, Video Generation | Unverified | 0 |
| Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric | Nov 25, 2024 | Video Generation, Video Quality Assessment | Unverified | 0 |
| LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis | Nov 24, 2024 | Diversity, Image Animation | Unverified | 0 |
| Optical-Flow Guided Prompt Optimization for Coherent Video Generation | Nov 23, 2024 | Optical Flow Estimation, Video Generation | Unverified | 0 |
| Importance-Based Token Merging for Efficient Image and Video Generation | Nov 23, 2024 | Image Generation, Video Generation | Unverified | 0 |
| Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification | Nov 22, 2024 | Autonomous Driving, Text-to-Video Generation | Code Available | 0 |