A CLIP-Hitchhiker's Guide to Long Video Retrieval | May 17, 2022 | Retrieval, Video Retrieval | Code Available | 1
Learning to Retrieve Videos by Asking Questions | May 11, 2022 | AI Agent, Retrieval | Code Available | 0
TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition | May 4, 2022 | Action Recognition, Representation Learning | Code Available | 1
CoCa: Contrastive Captioners are Image-Text Foundation Models | May 4, 2022 | Action Classification, Decoder | Code Available | 1
CenterCLIP: Token Clustering for Efficient Text-Video Retrieval | May 2, 2022 | Clustering, Retrieval | Code Available | 1
Learn to Understand Negation in Video Retrieval | Apr 30, 2022 | Natural Language Queries, Negation | Code Available | 0
Relevance-based Margin for Contrastively-trained Video Retrieval Models | Apr 27, 2022 | Multi-Instance Retrieval, Natural Language Queries | Code Available | 0
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval | Apr 26, 2022 | Action Recognition, Retrieval | Code Available | 1
A Survey of Video-based Action Quality Assessment | Apr 20, 2022 | Action Quality Assessment, Action Recognition | Unverified | 0
Modality-Balanced Embedding for Video Retrieval | Apr 18, 2022 | Retrieval, Text Matching | Unverified | 0
COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | Apr 15, 2022 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
Exploring the Temporal Cues to Enhance Video Retrieval on Standardized CDVA | Apr 11, 2022 | Retrieval, Video Retrieval | Code Available | 0
Probabilistic Representations for Video Contrastive Learning | Apr 8, 2022 | Action Recognition, Contrastive Learning | Unverified | 0
Tencent Text-Video Retrieval: Hierarchical Cross-Modal Interactions with Multi-Level Representations | Apr 7, 2022 | Contrastive Learning, Denoising | Unverified | 0
Temporal Alignment Networks for Long-term Video | Apr 6, 2022 | Action Recognition, Action Segmentation | Code Available | 1
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound | Apr 6, 2022 | Retrieval, Text to Video Retrieval | Code Available | 1
Learning Audio-Video Modalities from Image Captions | Apr 1, 2022 | Image Captioning, Retrieval | Unverified | 0
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language | Apr 1, 2022 | Diversity, Image Captioning | Code Available | 0
CREATE: A Benchmark for Chinese Short Video Retrieval and Title Generation | Mar 31, 2022 | Retrieval, Video Captioning | Unverified | 0
Controllable Augmentations for Video Representation Learning | Mar 30, 2022 | Action Recognition, Contrastive Learning | Unverified | 0
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval | Mar 28, 2022 | Retrieval, Text to Video Retrieval | Code Available | 1
FitCLIP: Refining Large-Scale Pretrained Image-Text Models for Zero-Shot Video Understanding Tasks | Mar 24, 2022 | Action Recognition, Retrieval | Code Available | 0
Learning video retrieval models with relevance-aware online mining | Mar 16, 2022 | Multi-Instance Retrieval, Retrieval | Code Available | 1
Revitalize Region Feature for Democratizing Video-Language Pre-training of Retrieval | Mar 15, 2022 | Question Answering, Retrieval | Code Available | 1
Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data | Mar 14, 2022 | Articles, Retrieval | Code Available | 1
All in One: Exploring Unified Video-Language Pre-training | Mar 14, 2022 | All, Language Modelling | Code Available | 2
MDMMT-2: Multidomain Multimodal Transformer for Video Retrieval, One More Step Towards Generalization | Mar 14, 2022 | Retrieval, Text to Video Retrieval | Unverified | 0
Disentangled Representation Learning for Text-Video Retrieval | Mar 14, 2022 | Representation Learning, Retrieval | Code Available | 1
Live Laparoscopic Video Retrieval with Compressed Uncertainty | Mar 8, 2022 | Retrieval, Video Retrieval | Unverified | 0
VScript: Controllable Script Generation with Visual Presentation | Mar 1, 2022 | Dialogue Generation, Retrieval | Unverified | 0
NEWSKVQA: Knowledge-Aware News Video Question Answering | Feb 8, 2022 | Common Sense Reasoning, Management | Unverified | 0
Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval | Feb 7, 2022 | Contrastive Learning, Quantization | Code Available | 1
Reading-strategy Inspired Visual Representation Learning for Text-to-Video Retrieval | Jan 23, 2022 | Representation Learning, Retrieval | Code Available | 1
Self-supervised Video Representation Learning with Cascade Positive Retrieval | Jan 20, 2022 | Action Recognition, Contrastive Learning | Code Available | 0
End-to-end Generative Pretraining for Multimodal Video Captioning | Jan 20, 2022 | Action Classification, Decoder | Unverified | 0
Bridging Video-text Retrieval with Multiple Choice Questions | Jan 13, 2022 | Action Recognition, Linear evaluation | Code Available | 1
Multi-Query Video Retrieval | Jan 10, 2022 | Retrieval, Video Retrieval | Code Available | 1
Watch Less and Uncover More: Could Navigation Tools Help Users Search and Explore Videos? | Jan 10, 2022 | Information Retrieval, Retrieval | Unverified | 0
Sign Language Video Retrieval with Free-Form Textual Queries | Jan 7, 2022 | Form, Retrieval | Unverified | 0
Sound and Visual Representation Learning with Multiple Pretraining Tasks | Jan 4, 2022 | Incremental Learning, Representation Learning | Unverified | 0
Everything at Once - Multi-Modal Fusion Transformer for Video Retrieval | Jan 1, 2022 | Action Localization, Retrieval | Code Available | 1
Video Joint Modelling Based on Hierarchical Transformer for Co-summarization | Dec 27, 2021 | Retrieval, Supervised Video Summarization | Code Available | 1
Cross Modal Retrieval with Querybank Normalisation | Dec 23, 2021 | Cross-Modal Retrieval, Metric Learning | Code Available | 1
Align and Prompt: Video-and-Language Pre-training with Entity Prompts | Dec 17, 2021 | cross-modal alignment, Entity Alignment | Code Available | 1
Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos | Dec 15, 2021 | Retrieval, Triplet | Code Available | 0
Self-supervised Spatiotemporal Representation Learning by Exploiting Video Continuity | Dec 11, 2021 | Action Localization, Action Recognition | Unverified | 0
Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval | Dec 8, 2021 | Action Localization, Retrieval | Code Available | 1
Prompting Visual-Language Models for Efficient Video Understanding | Dec 8, 2021 | Action Recognition, Language Modelling | Code Available | 1
Cross-modal Manifold Cutmix for Self-supervised Video Representation Learning | Dec 7, 2021 | Action Recognition, Representation Learning | Unverified | 0
Time-Equivariant Contrastive Video Representation Learning | Dec 7, 2021 | Action Recognition, Contrastive Learning | Unverified | 0