An overview on the evaluated video retrieval tasks at TRECVID 2022 Jun 22, 2023 Ad-hoc video search Retrieval
Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures Jul 27, 2023 Automatic Speech Recognition Contrastive Learning
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP Jun 21, 2021 Language Modeling Language Modelling
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval Apr 18, 2021 Retrieval Text Retrieval
Florence: A New Foundation Model for Computer Vision Nov 22, 2021 Action Classification Action Recognition
A Large Cross-Modal Video Retrieval Dataset with Reading Comprehension May 5, 2023 Reading Comprehension Retrieval
Clover: Towards A Unified Video-Language Alignment and Fusion Model Jul 16, 2022 Language Modeling Language Modelling
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training May 1, 2020 Language Modeling Language Modelling
Align and Prompt: Video-and-Language Pre-training with Entity Prompts Dec 17, 2021 cross-modal alignment Entity Alignment
Hierarchical Video-Moment Retrieval and Step-Captioning Mar 29, 2023 Information Retrieval Moment Retrieval
AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant Nov 30, 2021 Question Answering Retrieval
CoCa: Contrastive Captioners are Image-Text Foundation Models May 4, 2022 Action Classification Decoder
Learning video retrieval models with relevance-aware online mining Mar 16, 2022 Multi-Instance Retrieval Retrieval
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling Feb 11, 2021 Question Answering Retrieval
Marine Video Kit: A New Marine Video Dataset for Content-based Analysis and Retrieval Sep 23, 2022 Retrieval Video Retrieval
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos Jun 16, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning Nov 1, 2020 Cross-Modal Retrieval Representation Learning
In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval Sep 16, 2023 Retrieval Style Transfer
Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval Jan 1, 2023 Knowledge Distillation Language Modelling
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound Apr 6, 2022 Retrieval Text to Video Retrieval
Contrastive Masked Autoencoders for Self-Supervised Video Hashing Nov 21, 2022 Decoder Retrieval
Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions Nov 19, 2021 Retrieval Super-Resolution
EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval Jul 23, 2024 Re-Ranking Retrieval
Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning Sep 20, 2023 Contrastive Learning Retrieval
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss Sep 9, 2021 Mixture-of-Experts Retrieval
COSA: Concatenated Sample Pretrained Vision-Language Foundation Model Jun 15, 2023 Form model
LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling Jun 14, 2022 Decoder Language Modeling
CoVR-2: Automatic Data Construction for Composed Video Retrieval Aug 28, 2023 Composed Image Retrieval (CoIR) Composed Video Retrieval (CoVR)
Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval Dec 8, 2021 Action Localization Retrieval
End-to-End Learning of Visual Representations from Uncurated Instructional Videos Dec 13, 2019 Action Localization Action Recognition
Cross-Architecture Self-supervised Video Representation Learning May 26, 2022 Action Recognition Contrastive Learning
Cross-Modal Adapter for Text-Video Retrieval Nov 17, 2022 parameter-efficient fine-tuning Retrieval
Cross Modal Retrieval with Querybank Normalisation Dec 23, 2021 Cross-Modal Retrieval Metric Learning
An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling Sep 4, 2022 Fill Mask Optical Flow Estimation
DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization Jun 1, 2021 Question Answering Retrieval
Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval Oct 9, 2024 Retrieval Text Retrieval
Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and Data Oct 8, 2023 Action Recognition Continual Learning
A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval Aug 3, 2022 Data Augmentation Retrieval
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval Oct 7, 2022 Knowledge Distillation Retrieval
DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval Jun 24, 2021 Computational Efficiency Knowledge Distillation
Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning Oct 17, 2020 Retrieval Transfer Learning
Dual Encoding for Video Retrieval by Text Sep 10, 2020 Ad-hoc video search Retrieval
Temporal Context Aggregation for Video Retrieval with Contrastive Learning Aug 4, 2020 Contrastive Learning Representation Learning
DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval Jan 19, 2024 Retrieval Video Retrieval
CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval Sep 21, 2021 Corpus Video Moment Retrieval Moment Retrieval
Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation Jul 9, 2020 Few-Shot Image Classification Few-Shot Learning
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model Mar 17, 2023 Retrieval Video Retrieval
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval Apr 1, 2021 Retrieval Text Retrieval
Hysia: Serving DNN-Based Video-to-Retail Applications in Cloud Jun 9, 2020 GPU Video Retrieval
Condensed Movies: Story Based Retrieval with Contextual Embeddings May 8, 2020 Retrieval Text to Video Retrieval