HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training | Dec 30, 2022 | cross-modal alignment, TGIF-Action | Unverified | 0
You were saying? - Spoken Language in the V3C Dataset | Dec 15, 2022 | Retrieval, Video Retrieval | Code Available | 0
Contextual Explainable Video Representation: Human Perception-based Understanding | Dec 12, 2022 | Action Detection, Action Recognition | Code Available | 0
VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | Dec 9, 2022 | Question Answering, Retrieval | Unverified | 0
Masked Contrastive Pre-Training for Efficient Video-Text Retrieval | Dec 2, 2022 | Image-text Retrieval, Retrieval | Unverified | 0
Renmin University of China at TRECVID 2022: Improving Video Search by Feature Fusion and Negation Understanding | Nov 28, 2022 | Ad-hoc video search, Negation | Unverified | 0
Are All Combinations Equal? Combining Textual and Visual Features with Multiple Space Learning for Text-Based Video Retrieval | Nov 21, 2022 | All, Retrieval | Code Available | 0
SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-training | Nov 21, 2022 | cross-modal alignment, GPU | Unverified | 0
A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset | Nov 19, 2022 | Common Sense Reasoning, Graph Embedding | Unverified | 0
CLOP: Video-and-Language Pre-Training with Knowledge Regularizations | Nov 7, 2022 | Contrastive Learning, Retrieval | Unverified | 0
LiteVL: Efficient Video-Language Learning with Enhanced Spatial-Temporal Modeling | Oct 21, 2022 | Language Modeling, Language Modelling | Unverified | 0
Efficient Cross-Modal Video Retrieval with Meta-Optimized Frames | Oct 16, 2022 | Bilevel Optimization, Retrieval | Code Available | 0
Semantic Video Moments Retrieval at Scale: A New Task and a Baseline | Oct 15, 2022 | Retrieval, Video Retrieval | Unverified | 0
RaP: Redundancy-aware Video-language Pre-training for Text-Video Retrieval | Oct 13, 2022 | Contrastive Learning, Retrieval | Code Available | 0
Learning to Locate Visual Answer in Video Corpus Using Question | Oct 11, 2022 | Contrastive Learning, Language Modelling | Code Available | 0
Contrastive Video-Language Learning with Fine-grained Frame Sampling | Oct 10, 2022 | Question Answering, Representation Learning | Unverified | 0
Fighting FIRe with FIRE: Assessing the Validity of Text-to-Video Retrieval Benchmarks | Oct 10, 2022 | Retrieval, Text to Video Retrieval | Unverified | 0
ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval | Oct 9, 2022 | Retrieval, Sentence | Code Available | 0
Event Extraction in Video Transcripts | Oct 1, 2022 | Articles, Event Extraction | Unverified | 0
Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval | Sep 27, 2022 | Cross-Modal Retrieval, Retrieval | Unverified | 0
Multi-Granularity Graph Pooling for Video-based Person Re-Identification | Sep 23, 2022 | Node Clustering, Person Re-Identification | Unverified | 0
Pose-Aided Video-based Person Re-Identification via Recurrent Graph Convolutional Network | Sep 23, 2022 | Person Re-Identification, Retrieval | Unverified | 0
Semi-automatic Data Annotation System for Multi-Target Multi-Camera Vehicle Tracking | Sep 20, 2022 | Retrieval, Video Retrieval | Unverified | 0
Tree-based Text-Vision BERT for Video Search in Baidu Video Advertising | Sep 19, 2022 | Image Retrieval, Retrieval | Unverified | 0
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks | Sep 15, 2022 | Action Classification, Action Recognition | Unverified | 0
Temporal Contrastive Learning with Curriculum | Sep 2, 2022 | Action Recognition, Contrastive Learning | Unverified | 0
MuMUR: Multilingual Multimodal Universal Retrieval | Aug 24, 2022 | Image Retrieval, Machine Translation | Unverified | 0
STAR-GNN: Spatial-Temporal Video Representation for Content-based Retrieval | Aug 15, 2022 | Graph Neural Network, Representation Learning | Unverified | 0
Motion Sensitive Contrastive Learning for Self-supervised Video Representation | Aug 12, 2022 | Contrastive Learning, Representation Learning | Unverified | 0
QSAM-Net: Rain streak removal by quaternion neural network with self-attention module | Aug 8, 2022 | Benchmarking, object-detection | Unverified | 0
GOCA: Guided Online Cluster Assignment for Self-Supervised Video Representation Learning | Jul 20, 2022 | Action Recognition, Clustering | Code Available | 0
LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval | Jul 11, 2022 | Representation Learning, Retrieval | Unverified | 0
Robustness Analysis of Video-Language Models Against Visual and Language Perturbations | Jul 5, 2022 | Language Modeling, Language Modelling | Code Available | 0
Exploiting Semantic Role Contextualized Video Features for Multi-Instance Text-Video Retrieval EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022 | Jun 29, 2022 | Multi-Instance Retrieval, Retrieval | Code Available | 0
RoME: Role-aware Mixture-of-Expert Transformer for Text-to-Video Retrieval | Jun 26, 2022 | Mixture-of-Experts, Retrieval | Code Available | 0
Semantic Role Aware Correlation Transformer for Text to Video Retrieval | Jun 26, 2022 | Retrieval, Text to Video Retrieval | Code Available | 0
VRAG: Region Attention Graphs for Content-Based Video Retrieval | May 18, 2022 | Retrieval, Video Retrieval | Unverified | 0
Learning to Retrieve Videos by Asking Questions | May 11, 2022 | AI Agent, Retrieval | Code Available | 0
Learn to Understand Negation in Video Retrieval | Apr 30, 2022 | Natural Language Queries, Negation | Code Available | 0
Relevance-based Margin for Contrastively-trained Video Retrieval Models | Apr 27, 2022 | Multi-Instance Retrieval, Natural Language Queries | Code Available | 0
A Survey of Video-based Action Quality Assessment | Apr 20, 2022 | Action Quality Assessment, Action Recognition | Unverified | 0
Modality-Balanced Embedding for Video Retrieval | Apr 18, 2022 | Retrieval, Text Matching | Unverified | 0
COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | Apr 15, 2022 | Contrastive Learning, Cross-Modal Retrieval | Unverified | 0
Exploring the Temporal Cues to Enhance Video Retrieval on Standardized CDVA | Apr 11, 2022 | Retrieval, Video Retrieval | Code Available | 0
Probabilistic Representations for Video Contrastive Learning | Apr 8, 2022 | Action Recognition, Contrastive Learning | Unverified | 0
Tencent Text-Video Retrieval: Hierarchical Cross-Modal Interactions with Multi-Level Representations | Apr 7, 2022 | Contrastive Learning, Denoising | Unverified | 0
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language | Apr 1, 2022 | Diversity, Image Captioning | Code Available | 0
Learning Audio-Video Modalities from Image Captions | Apr 1, 2022 | Image Captioning, Retrieval | Unverified | 0
CREATE: A Benchmark for Chinese Short Video Retrieval and Title Generation | Mar 31, 2022 | Retrieval, Video Captioning | Unverified | 0
Controllable Augmentations for Video Representation Learning | Mar 30, 2022 | Action Recognition, Contrastive Learning | Unverified | 0