FitCLIP: Refining Large-Scale Pretrained Image-Text Models for Zero-Shot Video Understanding Tasks Mar 24, 2022 Action Recognition Retrieval
Exploring the Temporal Cues to Enhance Video Retrieval on Standardized CDVA Apr 11, 2022 Retrieval Video Retrieval
Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos Dec 15, 2021 Retrieval Triplet
RaP: Redundancy-aware Video-language Pre-training for Text-Video Retrieval Oct 13, 2022 Contrastive Learning Retrieval
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval Apr 20, 2021 Retrieval Video Retrieval
Exploring Temporal Concurrency for Video-Language Representation Learning Jan 1, 2023 Dynamic Time Warping Metric Learning
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation Jul 20, 2018 Face Generation Lip Reading
Person Search in Videos with One Portrait Through Visual and Temporal Links Jul 27, 2018 Person Re-Identification Person Search
Video Logo Retrieval based on local Features Aug 11, 2018 Image Retrieval Retrieval
Object Priors for Classifying and Localizing Unseen Actions Apr 10, 2021 Action Classification Action Localization
TC-MGC: Text-Conditioned Multi-Grained Contrastive Learning for Text-Video Retrieval Apr 7, 2025 Contrastive Learning Retrieval
Zorro: the masked multimodal transformer Jan 23, 2023 Audio Tagging Multimodal Deep Learning
Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning Mar 6, 2020 Density Estimation Noise Estimation
Robustness Analysis of Video-Language Models Against Visual and Language Perturbations Jul 5, 2022 Language Modeling Language Modelling
MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian Jun 20, 2023 Cross-Lingual Transfer Retrieval
MAMA: Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning Jul 4, 2024 Language Modeling Language Modelling
WAVER: Writing-style Agnostic Text-Video Retrieval via Distilling Vision-Language Models Through Open-Vocabulary Knowledge Dec 15, 2023 Information Retrieval Knowledge Distillation
Exploiting Semantic Role Contextualized Video Features for Multi-Instance Text-Video Retrieval EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022 Jun 29, 2022 Multi-Instance Retrieval Retrieval
Are All Combinations Equal? Combining Textual and Visual Features with Multiple Space Learning for Text-Based Video Retrieval Nov 21, 2022 All Retrieval
Enhancing Subsequent Video Retrieval via Vision-Language Models (VLMs) Mar 21, 2025 Representation Learning Retrieval
Learn to Understand Negation in Video Retrieval Apr 30, 2022 Natural Language Queries Negation
Efficient End-to-End Video Question Answering with Pyramidal Multimodal Transformer Feb 4, 2023 Computational Efficiency Question Answering
Efficient Cross-Modal Video Retrieval with Meta-Optimized Frames Oct 16, 2022 Bilevel Optimization Retrieval
Win-Fail Action Recognition Feb 15, 2021 Action Recognition Action Understanding
Learning to Retrieve Videos by Asking Questions May 11, 2022 AI Agent Retrieval
Learning to Locate Visual Answer in Video Corpus Using Question Oct 11, 2022 Contrastive Learning Language Modelling
Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval Jun 11, 2018 Image-text Retrieval Retrieval
A Joint Sequence Fusion Model for Video Question Answering and Retrieval Aug 7, 2018 Decoder Multiple-choice
ECO: Efficient Convolutional Network for Online Video Understanding Apr 24, 2018 Action Classification Action Recognition
TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm Sep 30, 2024 Retrieval Video Retrieval
Dual Encoding for Zero-Example Video Retrieval Sep 17, 2018 Ad-hoc video search Retrieval
Learning from Video and Text via Large-Scale Discriminative Clustering Jul 27, 2017 Action Recognition Clustering
Discriminative Residual Analysis for Image Set Classification with Posture and Age Variations Aug 23, 2020 General Classification Retrieval
Towards Efficient Partially Relevant Video Retrieval with Active Moment Discovering Apr 15, 2025 Partially Relevant Video Retrieval Retrieval
Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval Sep 15, 2023 Retrieval Video Classification
Language-Conditioned Change-point Detection to Identify Sub-Tasks in Robotics Domains Sep 1, 2023 Change Point Detection Instruction Following