| BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval | Oct 29, 2021 | Cross-Modal Retrieval, Relation | Code Available | 1 |
| ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning | Aug 20, 2019 | ISVR, Retrieval | Code Available | 1 |
| FiVE: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models | Mar 17, 2025 | Sensitivity, Video Editing | Unverified | 0 |
| Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions | Mar 7, 2025 | Retrieval, Video Retrieval | Unverified | 0 |
| Contextual Augmented Global Contrast for Multimodal Intent Recognition | Jan 1, 2024 | Contrastive Learning, Intent Recognition | Unverified | 0 |
| Network-Based Video Recommendation Using Viewing Patterns and Modularity Analysis: An Integrated Framework | Aug 24, 2023 | Clustering, Collaborative Filtering | Unverified | 0 |
| Few-shot Action Recognition via Intra- and Inter-Video Information Maximization | May 10, 2023 | Action Recognition, Few-Shot action recognition | Unverified | 0 |
| Compound Prototype Matching for Few-shot Action Recognition | Jul 12, 2022 | Action Recognition, Few-Shot action recognition | Unverified | 0 |
| Cross-Lingual Cross-Modal Consolidation for Effective Multilingual Video Corpus Moment Retrieval | Jul 1, 2022 | Moment Retrieval, Retrieval | Unverified | 0 |
| Tencent-MVSE: A Large-Scale Benchmark Dataset for Multi-Modal Video Similarity Evaluation | Jan 1, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |