Self-supervised Video Representation Learning with Cascade Positive Retrieval Jan 20, 2022 Action Recognition Contrastive Learning
MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian Jun 20, 2023 Cross-Lingual Transfer Retrieval
Dual Encoding for Zero-Example Video Retrieval Sep 17, 2018 Ad-hoc video search Retrieval
FIVR: Fine-grained Incident Video Retrieval Sep 11, 2018 Benchmarking Retrieval
Self-supervised Video Representation Learning by Context and Motion Decoupling Apr 2, 2021 Action Recognition CPU
Is Multimodal Vision Supervision Beneficial to Language? Feb 10, 2023 Image Retrieval Natural Language Understanding
TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm Sep 30, 2024 Retrieval Video Retrieval
Joint Searching and Grounding: Multi-Granularity Video Content Retrieval Oct 23, 2023 Contrastive Learning Retrieval
WAVER: Writing-style Agnostic Text-Video Retrieval via Distilling Vision-Language Models Through Open-Vocabulary Knowledge Dec 15, 2023 Information Retrieval Knowledge Distillation
Win-Fail Action Recognition Feb 15, 2021 Action Recognition Action Understanding
Towards Efficient Partially Relevant Video Retrieval with Active Moment Discovering Apr 15, 2025 Partially Relevant Video Retrieval Retrieval
Deep Hashing with Category Mask for Fast Video Retrieval Dec 22, 2017 Code Generation Deep Hashing
Semantic Role Aware Correlation Transformer for Text to Video Retrieval Jun 26, 2022 Retrieval Text to Video Retrieval
GOCA: Guided Online Cluster Assignment for Self-Supervised Video Representation Learning Jul 20, 2022 Action Recognition Clustering
Object Priors for Classifying and Localizing Unseen Actions Apr 10, 2021 Action Classification Action Localization
Contextual Explainable Video Representation: Human Perception-based Understanding Dec 12, 2022 Action Detection Action Recognition
LAMV: Learning to Align and Match Videos With Kernelized Temporal Layers Jun 1, 2018 Copy Detection Retrieval
Graph Based Temporal Aggregation for Video Retrieval Nov 4, 2020 Retrieval Video Retrieval
MAMA: Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning Jul 4, 2024 Language Modeling Language Modelling
Discriminative Residual Analysis for Image Set Classification with Posture and Age Variations Aug 23, 2020 General Classification Retrieval
Language-Conditioned Change-point Detection to Identify Sub-Tasks in Robotics Domains Sep 1, 2023 Change Point Detection Instruction Following
Person Search in Videos with One Portrait Through Visual and Temporal Links Jul 27, 2018 Person Re-Identification Person Search
AMIL: Adversarial Multi Instance Learning for Human Pose Estimation Mar 18, 2020 Multiple Instance Learning Pose Estimation
Efficient Cross-Modal Video Retrieval with Meta-Optimized Frames Oct 16, 2022 Bilevel Optimization Retrieval
Hashing with Mutual Information Mar 2, 2018 Image Retrieval Retrieval
Exploring the Temporal Cues to Enhance Video Retrieval on Standardized CDVA Apr 11, 2022 Retrieval Video Retrieval
You were saying? - Spoken Language in the V3C Dataset Dec 15, 2022 Retrieval Video Retrieval
Hierarchical Banzhaf Interaction for General Video-Language Representation Learning Dec 30, 2024 Contrastive Learning Question Answering
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language Apr 1, 2022 Diversity Image Captioning
Learning to Locate Visual Answer in Video Corpus Using Question Oct 11, 2022 Contrastive Learning Language Modelling
Accommodating Audio Modality in CLIP for Multimodal Processing Mar 12, 2023 AudioCaps Contrastive Learning
Video-Text Retrieval by Supervised Sparse Multi-Grained Learning Feb 19, 2023 Representation Learning Retrieval
Exploring Temporal Concurrency for Video-Language Representation Learning Jan 1, 2023 Dynamic Time Warping Metric Learning
Learning to Retrieve Videos by Asking Questions May 11, 2022 AI Agent Retrieval
Zorro: the masked multimodal transformer Jan 23, 2023 Audio Tagging Multimodal Deep Learning
Inter-intra Variant Dual Representations for Self-supervised Video Recognition Jul 2, 2021 Contrastive Learning Representation Learning
Efficient End-to-End Video Question Answering with Pyramidal Multimodal Transformer Feb 4, 2023 Computational Efficiency Question Answering
Analyzing Zero-Shot Abilities of Vision-Language Models on Video Understanding Tasks Oct 7, 2023 Action Recognition Multiple-choice
Central Similarity Quantization for Efficient Image and Video Retrieval Aug 1, 2019 Quantization Retrieval
Aligning Step-by-Step Instructional Diagrams to Video Demonstrations Mar 24, 2023 Contrastive Learning Image Retrieval
Unmasked Teacher: Towards Training-Efficient Video Foundation Models Mar 28, 2023 Action Classification Action Recognition
A Challenge to Build Neuro-Symbolic Video Agents May 20, 2025 Scene Classification Video Retrieval
Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos Dec 15, 2021 Retrieval Triplet
Learning from Video and Text via Large-Scale Discriminative Clustering Jul 27, 2017 Action Recognition Clustering
ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models Jun 28, 2023 Retrieval Video Retrieval
Exploiting Semantic Role Contextualized Video Features for Multi-Instance Text-Video Retrieval EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022 Jun 29, 2022 Multi-Instance Retrieval Retrieval
Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval Jun 11, 2018 Image-text Retrieval Retrieval
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval Apr 20, 2021 Retrieval Video Retrieval
Improving Video Corpus Moment Retrieval with Partial Relevance Enhancement Feb 21, 2024 Moment Retrieval Retrieval
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation Jul 20, 2018 Face Generation Lip Reading