| GenRec: Unifying Video Generation and Recognition with Diffusion Models | Aug 27, 2024 | Image to Video Generation, Video Generation | Code Available | 0 |
| Fundus2Video: Cross-Modal Angiography Video Generation from Static Fundus Photography with Clinical Knowledge Guidance | Aug 27, 2024 | Clinical Knowledge, Lesion Segmentation | Code Available | 0 |
| SurGen: Text-Guided Diffusion Model for Surgical Video Generation | Aug 26, 2024 | Video Generation | —Unverified | 0 |
| Decoupled Video Generation with Chain of Training-free Diffusion Model Experts | Aug 24, 2024 | Denoising, Video Generation | —Unverified | 0 |
| TVG: A Training-free Transition Video Generation Method with Diffusion Models | Aug 24, 2024 | GPR, Video Generation | —Unverified | 0 |
| EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation | Aug 23, 2024 | Image Generation, Video Generation | —Unverified | 0 |
| xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations | Aug 22, 2024 | Dense Captioning, Motion Estimation | —Unverified | 0 |
| DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework | Aug 21, 2024 | Video Generation | —Unverified | 0 |
| TrackGo: A Flexible and Efficient Method for Controllable Video Generation | Aug 21, 2024 | Video Generation | —Unverified | 0 |
| Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation | Aug 19, 2024 | Instruction Following, Large Language Model | —Unverified | 0 |
| Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data | Aug 19, 2024 | Descriptive, Image to Video Generation | Code Available | 0 |
| When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding | Aug 15, 2024 | Video Compression, Video Generation | —Unverified | 0 |
| JPEG-LM: LLMs as Image Generators with Canonical Codec Representations | Aug 15, 2024 | Image Generation, Quantization | —Unverified | 0 |
| High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model | Aug 10, 2024 | Face Generation, Talking Face Generation | —Unverified | 0 |
| Scene123: One Prompt to 3D Scene Generation via Video-Assisted and Consistency-Enhanced MAE | Aug 10, 2024 | Scene Generation, Video Generation | —Unverified | 0 |
| Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics | Aug 8, 2024 | Video Generation | —Unverified | 0 |
| VidGen-1M: A Large-Scale Dataset for Text-to-video Generation | Aug 5, 2024 | Text-to-Video Generation, Video Generation | —Unverified | 0 |
| Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion | Aug 1, 2024 | Face Reenactment, Video Generation | —Unverified | 0 |
| Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation | Jul 31, 2024 | Position, Video Generation | Code Available | 0 |
| Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model | Jul 31, 2024 | Benchmarking, Large Language Model | Code Available | 0 |
| Fine-gained Zero-shot Video Sampling | Jul 31, 2024 | Image Generation, Video Editing | —Unverified | 0 |
| FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention | Jul 29, 2024 | Denoising, Video Generation | —Unverified | 0 |
| FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for Diffusion Models | Jul 28, 2024 | Denoising, Video Generation | Code Available | 0 |
| Faster Image2Video Generation: A Closer Look at CLIP Image Embedding's Impact on Spatio-Temporal Cross-Attentions | Jul 27, 2024 | Computational Efficiency, Video Generation | —Unverified | 0 |
| SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency | Jul 24, 2024 | NeRF, Novel View Synthesis | —Unverified | 0 |
| Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data | Jul 23, 2024 | Video Generation | —Unverified | 0 |
| MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence | Jul 23, 2024 | Video Generation | —Unverified | 0 |
| Anchored Diffusion for Video Face Reenactment | Jul 21, 2024 | Face Reenactment, Video Generation | —Unverified | 0 |
| Unlearning Concepts from Text-to-Video Diffusion Models | Jul 19, 2024 | Text-to-Video Generation, Video Generation | —Unverified | 0 |
| Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion | Jul 18, 2024 | Imputation, Video Generation | —Unverified | 0 |
| Multi-sentence Video Grounding for Long Video Generation | Jul 18, 2024 | Moment Retrieval, Retrieval | —Unverified | 0 |
| VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control | Jul 17, 2024 | Video Generation | —Unverified | 0 |
| Towards Understanding Unsafe Video Generation | Jul 17, 2024 | Image Generation, Video Generation | Code Available | 0 |
| A Survey of Defenses against AI-generated Visual Media: Detection, Disruption, and Authentication | Jul 15, 2024 | Fairness, Image Generation | —Unverified | 0 |
| Learning Online Scale Transformation for Talking Head Video Generation | Jul 13, 2024 | Face Reenactment, Video Generation | —Unverified | 0 |
| Bora: Biomedical Generalist Video Generation Model | Jul 12, 2024 | Cell Tracking, Data Augmentation | —Unverified | 0 |
| Inference Optimization of Foundation Models on AI Accelerators | Jul 12, 2024 | Inference Optimization, Model Compression | —Unverified | 0 |
| Still-Moving: Customized Video Generation without Customized Video Data | Jul 11, 2024 | Video Generation | —Unverified | 0 |
| E2VIDiff: Perceptual Events-to-Video Reconstruction using Diffusion Priors | Jul 11, 2024 | Image Generation, Video Generation | —Unverified | 0 |
| VEnhancer: Generative Space-Time Enhancement for Video Generation | Jul 10, 2024 | Data Augmentation, Super-Resolution | —Unverified | 0 |
| Video-to-Audio Generation with Hidden Alignment | Jul 10, 2024 | Audio Generation, Data Augmentation | —Unverified | 0 |
| Mobius: A High Efficient Spatial-Temporal Parallel Training Paradigm for Text-to-Video Generation Task | Jul 9, 2024 | GPU, Text-to-Video Generation | Code Available | 0 |
| The Tug-of-War Between Deepfake Generation and Detection | Jul 8, 2024 | Face Swapping, Misinformation | —Unverified | 0 |
| VIMI: Grounding Video Generation through Multi-modal Instruction | Jul 8, 2024 | Text-to-Video Generation, Video Generation | —Unverified | 0 |
| This&That: Language-Gesture Controlled Video Generation for Robot Planning | Jul 8, 2024 | Task Planning, Video Generation | —Unverified | 0 |
| T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models | Jul 8, 2024 | Video Generation | —Unverified | 0 |
| OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation | Jul 2, 2024 | Text-to-Video Generation, Video Generation | —Unverified | 0 |
| GVDIFF: Grounded Text-to-Video Generation with Diffusion Models | Jul 2, 2024 | Text-to-Video Generation, Video Generation | —Unverified | 0 |
| SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix | Jun 29, 2024 | Denoising, Video Generation | —Unverified | 0 |
| What Matters in Detecting AI-Generated Videos like Sora? | Jun 27, 2024 | Optical Flow Estimation, Video Generation | —Unverified | 0 |