| ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning | Aug 20, 2019 | ISVR, Retrieval | Code Available | 1 |
| BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval | Oct 29, 2021 | Cross-Modal Retrieval, Relation | Code Available | 1 |
| FiVE: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models | Mar 17, 2025 | Sensitivity, Video Editing | Unverified | 0 |
| Few-shot Action Recognition via Intra- and Inter-Video Information Maximization | May 10, 2023 | Action Recognition, Few-Shot Action Recognition | Unverified | 0 |
| Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions | Mar 7, 2025 | Retrieval, Video Retrieval | Unverified | 0 |
| Self-supervised Video Retrieval Transformer Network | Apr 16, 2021 | Retrieval, Self-supervised Video Retrieval | Unverified | 0 |
| Video Similarity and Alignment Learning on Partial Video Copy Detection | Aug 4, 2021 | Copy Detection, Partial Video Copy Detection | Unverified | 0 |
| Compound Prototype Matching for Few-shot Action Recognition | Jul 12, 2022 | Action Recognition, Few-Shot Action Recognition | Unverified | 0 |
| Contextual Augmented Global Contrast for Multimodal Intent Recognition | Jan 1, 2024 | Contrastive Learning, Intent Recognition | Unverified | 0 |
| Tencent-MVSE: A Large-Scale Benchmark Dataset for Multi-Modal Video Similarity Evaluation | Jan 1, 2022 | Automatic Speech Recognition (ASR) | Unverified | 0 |