| Unveiling Deep Shadows: A Survey and Benchmark on Image and Video Shadow Detection, Removal, and Generation in the Deep Learning Era | Sep 3, 2024 | Scene Understanding, Shadow Detection | Code Available | 2 |
| Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets | Sep 2, 2024 | Video Alignment, Video Editing | Unverified | 0 |
| CSS-Segment: 2nd Place Report of LSVOS Challenge VOS Track | Aug 24, 2024 | Autonomous Driving, Object | Unverified | 0 |
| VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment | Aug 21, 2024 | Video Alignment, Video Editing | Code Available | 2 |
| Audio Match Cutting: Finding and Creating Matching Audio Transitions in Movies and Videos | Aug 20, 2024 | Video Editing | Unverified | 0 |
| Language-Driven Interactive Shadow Detection | Aug 16, 2024 | Descriptive, Shadow Detection | Code Available | 0 |
| DeCo: Decoupled Human-Centered Diffusion Video Editing with Motion Consistency | Aug 14, 2024 | text-guided-image-editing, Video Editing | Unverified | 0 |
| Segment Anything for Videos: A Systematic Survey | Jul 31, 2024 | Image Segmentation, Robot Manipulation Generalization | Code Available | 5 |
| Fine-gained Zero-shot Video Sampling | Jul 31, 2024 | Image Generation, Video Editing | Unverified | 0 |
| Text-based Talking Video Editing with Cascaded Conditional Diffusion | Jul 20, 2024 | Video Editing | Unverified | 0 |
| Multi-sentence Video Grounding for Long Video Generation | Jul 18, 2024 | Moment Retrieval, Retrieval | Unverified | 0 |
| Rethinking the Architecture Design for Efficient Generic Event Boundary Detection | Jul 17, 2024 | Boundary Detection, Generic Event Boundary Detection | Code Available | 1 |
| InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models | Jul 15, 2024 | Object, Video Editing | Unverified | 0 |
| Audio-driven High-resolution Seamless Talking Head Video Editing via StyleGAN | Jul 8, 2024 | Disentanglement, Video Editing | Unverified | 0 |
| Transformer-based Image and Video Inpainting: Current Challenges and Future Directions | Jun 28, 2024 | Image Inpainting, Video Editing | Unverified | 0 |
| Diffusion Model-Based Video Editing: A Survey | Jun 26, 2024 | model, Survey | Code Available | 3 |
| MVOC: a training-free multiple video object composition method with diffusion models | Jun 22, 2024 | Image to Video Generation, Object | Code Available | 1 |
| A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models | Jun 20, 2024 | Video Editing | Code Available | 3 |
| V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data | Jun 20, 2024 | Attribute, Video Editing | Unverified | 0 |
| VIA: Unified Spatiotemporal Video Adaptation Framework for Global and Local Video Editing | Jun 18, 2024 | Video Editing | Unverified | 0 |
| VideoGUI: A Benchmark for GUI Automation from Instructional Videos | Jun 14, 2024 | Video Editing | Unverified | 0 |
| Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion | Jun 13, 2024 | Optical Flow Estimation, Video Editing | Unverified | 0 |
| COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing | Jun 13, 2024 | Denoising, GPU | Code Available | 1 |
| 2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation | Jun 12, 2024 | Instance Segmentation, Semantic Segmentation | Unverified | 0 |
| HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness | Jun 11, 2024 | Object, Video Editing | Unverified | 0 |
| Compositional Video Generation as Flow Equalization | Jun 10, 2024 | Video Editing, Video Generation | Code Available | 2 |
| FRAG: Frequency Adapting Group for Diffusion Video Editing | Jun 10, 2024 | Denoising, Video Editing | Code Available | 2 |
| NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing | Jun 10, 2024 | Scheduling, Video Editing | Code Available | 2 |
| Training-Free Robust Interactive Video Object Segmentation | Jun 8, 2024 | Interactive Video Object Segmentation, Object | Unverified | 0 |
| Ada-VE: Training-Free Consistent Video Editing Using Adaptive Motion Prior | Jun 7, 2024 | Consistent Character Generation, Optical Flow Estimation | Code Available | 0 |
| Zero-Shot Video Editing through Adaptive Sliding Score Distillation | Jun 7, 2024 | Denoising, Text-to-Video Generation | Unverified | 0 |
| Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting | Jun 4, 2024 | 3DGS, NeRF | Unverified | 0 |
| The Curious Case of End Token: A Zero-Shot Disentangled Image Editing using CLIP | Jun 1, 2024 | Attribute, Video Editing | Unverified | 0 |
| 2nd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation | Jun 1, 2024 | Autonomous Driving, Panoptic Segmentation | Unverified | 0 |
| Streaming Video Diffusion: Online Video Editing with Diffusion Models | May 30, 2024 | Video Editing | Code Available | 1 |
| MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion | May 30, 2024 | Denoising, GPU | Code Available | 3 |
| RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives | May 28, 2024 | Attribute, Video Editing | Code Available | 1 |
| Unified Editing of Panorama, 3D Scenes, and Videos Through Disentangled Self-Attention Injection | May 27, 2024 | Image Generation, Video Editing | Unverified | 0 |
| I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models | May 26, 2024 | Video Editing | Unverified | 0 |
| ReVideo: Remake a Video with Motion and Content Control | May 22, 2024 | Video Editing, Video Generation | Unverified | 0 |
| Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices | May 20, 2024 | Image Generation, Video Editing | Code Available | 2 |
| Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing | May 7, 2024 | Object, Video Editing | Unverified | 0 |
| V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection | Apr 25, 2024 | Prompt Learning, Video Editing | Code Available | 0 |
| GenVideo: One-shot Target-image and Shape Aware Video Editing using T2I Diffusion Models | Apr 18, 2024 | Video Editing | Unverified | 0 |
| Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model | Apr 15, 2024 | GPU, Image Generation | Unverified | 0 |
| S3Editor: A Sparse Semantic-Disentangled Self-Training Framework for Face Video Editing | Apr 11, 2024 | Attribute, Video Editing | Unverified | 0 |
| Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models | Apr 8, 2024 | Video Editing | Unverified | 0 |
| AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment | Apr 7, 2024 | Video Editing, Video Generation | Unverified | 0 |
| ExpressEdit: Video Editing with Natural Language and Sketching | Mar 26, 2024 | Video Editing | Unverified | 0 |
| EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing | Mar 24, 2024 | Attribute, Video Editing | Unverified | 0 |