| Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning | Mar 28, 2022 | Action Classification, Contrastive Learning | Code Available | 1 |
| DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval | Jun 10, 2025 | Image Captioning, Retrieval | Code Available | 1 |
| Time-Contrastive Networks: Self-Supervised Learning from Video | Apr 23, 2017 | Metric Learning, reinforcement-learning | Code Available | 1 |
| VADER: Video Alignment Differencing and Retrieval | Mar 23, 2023 | Misinformation, Retrieval | Unverified | 0 |
| A Comprehensive Review of Few-shot Action Recognition | Jul 20, 2024 | Action Recognition, Few-Shot action recognition | Unverified | 0 |
| Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering | Jul 3, 2024 | Contrastive Learning, Language Modelling | Unverified | 0 |
| AniClipart: Clipart Animation with Text-to-Video Priors | Apr 18, 2024 | Image to Video Generation, Text-to-Video Generation | Unverified | 0 |
| Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment | Jul 24, 2023 | Retrieval, Text to Video Retrieval | Unverified | 0 |
| Audio-Sync Video Generation with Multi-Stream Temporal Control | Jun 9, 2025 | Audio-Visual Synchronization, Video Alignment | Unverified | 0 |
| Book2Movie: Aligning Video Scenes With Book Chapters | Jun 1, 2015 | Video Alignment | Unverified | 0 |
| ContentCTR: Frame-level Live Streaming Click-Through Rate Prediction with Multimodal Transformer | Jun 26, 2023 | Click-Through Rate Prediction, Dynamic Time Warping | Unverified | 0 |
| DAPE: Dual-Stage Parameter-Efficient Fine-Tuning for Consistent Video Editing with Diffusion Models | May 11, 2025 | parameter-efficient fine-tuning, Video Alignment | Unverified | 0 |
| DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation | Apr 21, 2025 | Attribute, Denoising | Unverified | 0 |
| FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing | Mar 10, 2024 | Image Generation, Text-to-Video Editing | Unverified | 0 |
| Frequency-aware Event-based Video Deblurring for Real-World Motion Blur | Jan 1, 2024 | Deblurring, Video Alignment | Unverified | 0 |
| Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback | Dec 3, 2024 | Object, Offline RL | Unverified | 0 |
| Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content | Oct 10, 2024 | Video Alignment, Video Generation | Unverified | 0 |
| Learning by Aligning 2D Skeleton Sequences and Multi-Modality Fusion | May 31, 2023 | Retrieval, Self-Supervised Learning | Unverified | 0 |
| Learning by Aligning Videos in Time | Mar 31, 2021 | Representation Learning, Retrieval | Unverified | 0 |
| Learning Robust Video Synchronization without Annotations | Oct 19, 2016 | Video Alignment, Video Synchronization | Unverified | 0 |
| Learning to Align Images using Weak Geometric Supervision | Aug 4, 2018 | Video Alignment | Unverified | 0 |
| Learning to Ground Instructional Articles in Videos through Narrations | Jun 6, 2023 | Articles, Video Alignment | Unverified | 0 |
| Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment | Sep 22, 2024 | Contrastive Learning, cross-modal alignment | Unverified | 0 |
| Learning to Predict Activity Progress by Self-Supervised Video Alignment | Jan 1, 2024 | Representation Learning, Video Alignment | Unverified | 0 |
| STELLA: Continual Audio-Video Pre-training with Spatio-Temporal Localized Alignment | Oct 12, 2023 | Continual Learning, Representation Learning | Unverified | 0 |