| Text-based Talking Video Editing with Cascaded Conditional Diffusion | Jul 20, 2024 | Video Editing | —Unverified | 0 |
| Multi-sentence Video Grounding for Long Video Generation | Jul 18, 2024 | Moment Retrieval, Retrieval | —Unverified | 0 |
| InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models | Jul 15, 2024 | Object, Video Editing | —Unverified | 0 |
| Audio-driven High-resolution Seamless Talking Head Video Editing via StyleGAN | Jul 8, 2024 | Disentanglement, Video Editing | —Unverified | 0 |
| Transformer-based Image and Video Inpainting: Current Challenges and Future Directions | Jun 28, 2024 | Image Inpainting, Video Editing | —Unverified | 0 |
| V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data | Jun 20, 2024 | Attribute, Video Editing | —Unverified | 0 |
| VIA: Unified Spatiotemporal Video Adaptation Framework for Global and Local Video Editing | Jun 18, 2024 | Video Editing | —Unverified | 0 |
| VideoGUI: A Benchmark for GUI Automation from Instructional Videos | Jun 14, 2024 | Video Editing | —Unverified | 0 |
| Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion | Jun 13, 2024 | Optical Flow Estimation, Video Editing | —Unverified | 0 |
| 2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation | Jun 12, 2024 | Instance Segmentation, Semantic Segmentation | —Unverified | 0 |