| DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval | Jun 10, 2025 | Image CaptioningRetrieval | CodeCode Available | 1 |
| Audio-Sync Video Generation with Multi-Stream Temporal Control | Jun 9, 2025 | Audio-Visual SynchronizationVideo Alignment | —Unverified | 0 |
| Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization and Temporal Motion Modulation | May 29, 2025 | Portrait AnimationVideo Alignment | CodeCode Available | 2 |
| LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation | May 17, 2025 | BenchmarkingQuestion Answering | CodeCode Available | 1 |
| DAPE: Dual-Stage Parameter-Efficient Fine-Tuning for Consistent Video Editing with Diffusion Models | May 11, 2025 | parameter-efficient fine-tuningVideo Alignment | —Unverified | 0 |
| HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation | May 7, 2025 | Human-Domain Subject-to-VideoSingle-Domain Subject-to-Video | CodeCode Available | 5 |
| DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation | Apr 21, 2025 | AttributeDenoising | —Unverified | 0 |
| Video4DGen: Enhancing Video and 4D Generation through Mutual Optimization | Apr 5, 2025 | 3D GenerationVideo Alignment | CodeCode Available | 3 |
| VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion | Mar 11, 2025 | Image MattingVideo Alignment | CodeCode Available | 1 |
| Deep Understanding of Sign Language for Sign to Subtitle Alignment | Mar 5, 2025 | TranslationVideo Alignment | CodeCode Available | 0 |