An overview on the evaluated video retrieval tasks at TRECVID 2022 (Jun 22, 2023). Tags: Ad-hoc video search, Retrieval
MDMMT: Multidomain Multimodal Transformer for Video Retrieval (Mar 19, 2021). Tags: Retrieval, Text to Video Retrieval
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP (Jun 21, 2021). Tags: Language Modeling, Language Modelling
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval (Apr 18, 2021). Tags: Retrieval, Text Retrieval
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale (Oct 7, 2023). Tags: Automatic Speech Recognition, Video Captioning
Dual Encoding for Video Retrieval by Text (Sep 10, 2020). Tags: Ad-hoc video search, Retrieval
Clover: Towards A Unified Video-Language Alignment and Fusion Model (Jul 16, 2022). Tags: Language Modeling, Language Modelling
Multimedia Retrieval Through Unsupervised Hypergraph-Based Manifold Ranking (Dec 1, 2019). Tags: Content-Based Image Retrieval, Retrieval
Align and Prompt: Video-and-Language Pre-training with Entity Prompts (Dec 17, 2021). Tags: cross-modal alignment, Entity Alignment
Multi-Query Video Retrieval (Jan 10, 2022). Tags: Retrieval, Video Retrieval
AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant (Nov 30, 2021). Tags: Question Answering, Retrieval
CoCa: Contrastive Captioners are Image-Text Foundation Models (May 4, 2022). Tags: Action Classification, Decoder
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss (Sep 9, 2021). Tags: Mixture-of-Experts, Retrieval
Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures (Jul 27, 2023). Tags: Automatic Speech Recognition, Contrastive Learning
GMMFormer v2: An Uncertainty-aware Framework for Partially Relevant Video Retrieval (May 22, 2024). Tags: Partially Relevant Video Retrieval, Retrieval
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning (Nov 1, 2020). Tags: Cross-Modal Retrieval, Representation Learning
GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval (Oct 8, 2023). Tags: Partially Relevant Video Retrieval, Retrieval
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval (Apr 1, 2021). Tags: Retrieval, Text Retrieval
Contrastive Masked Autoencoders for Self-Supervised Video Hashing (Nov 21, 2022). Tags: Decoder, Retrieval
Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions (Nov 19, 2021). Tags: Retrieval, Super-Resolution
DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval (Jun 24, 2021). Tags: Computational Efficiency, Knowledge Distillation
Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation (Jul 9, 2020). Tags: Few-Shot Image Classification, Few-Shot Learning
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training (May 1, 2020). Tags: Language Modeling, Language Modelling
Florence: A New Foundation Model for Computer Vision (Nov 22, 2021). Tags: Action Classification, Action Recognition
Disentangled Representation Learning for Text-Video Retrieval (Mar 14, 2022). Tags: Representation Learning, Retrieval
COSA: Concatenated Sample Pretrained Vision-Language Foundation Model (Jun 15, 2023). Tags: Form model
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos (Jun 16, 2020). Tags: Automatic Speech Recognition, Automatic Speech Recognition (ASR)
CoVR-2: Automatic Data Construction for Composed Video Retrieval (Aug 28, 2023). Tags: Composed Image Retrieval (CoIR), Composed Video Retrieval (CoVR)
Holistic Features are almost Sufficient for Text-to-Video Retrieval (Jan 1, 2024). Tags: Retrieval, text similarity
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips (Jun 7, 2019). Tags: Action Localization, Long Video Retrieval (Background Removed)
Cross-Architecture Self-supervised Video Representation Learning (May 26, 2022). Tags: Action Recognition, Contrastive Learning
Cross-Modal Adapter for Text-Video Retrieval (Nov 17, 2022). Tags: parameter-efficient fine-tuning, Retrieval
Cross Modal Retrieval with Querybank Normalisation (Dec 23, 2021). Tags: Cross-Modal Retrieval, Metric Learning
An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling (Sep 4, 2022). Tags: Fill Mask, Optical Flow Estimation
DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (Jun 1, 2021). Tags: Question Answering, Retrieval
Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval (Oct 9, 2024). Tags: Retrieval, Text Retrieval
Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning (Oct 17, 2020). Tags: Retrieval, Transfer Learning
A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval (Aug 3, 2022). Tags: Data Augmentation, Retrieval
Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval (Dec 8, 2021). Tags: Action Localization, Retrieval
In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval (Sep 16, 2023). Tags: Retrieval, Style Transfer
Temporal Context Aggregation for Video Retrieval with Contrastive Learning (Aug 4, 2020). Tags: Contrastive Learning, Representation Learning
Dense-Captioning Events in Videos (May 2, 2017). Tags: Dense Captioning, Retrieval
CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval (Sep 21, 2021). Tags: Corpus Video Moment Retrieval, Moment Retrieval
DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval (Jan 19, 2024). Tags: Retrieval, Video Retrieval
Everything at Once - Multi-Modal Fusion Transformer for Video Retrieval (Jan 1, 2022). Tags: Action Localization, Retrieval
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound (Apr 6, 2022). Tags: Retrieval, Text to Video Retrieval
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model (Mar 17, 2023). Tags: Retrieval, Video Retrieval
Condensed Movies: Story Based Retrieval with Contextual Embeddings (May 8, 2020). Tags: Retrieval, Text to Video Retrieval
Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval (Dec 3, 2021). Tags: Ad-hoc video search, feature selection
3D-CSL: self-supervised 3D context similarity learning for Near-Duplicate Video Retrieval (Nov 10, 2022). Tags: Retrieval, Self-Supervised Learning