MAGMaR Shared Task System Description: Video Retrieval with OmniEmbed | Jun 11, 2025 | Retrieval Video Retrieval | Unverified 0
Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval | Jun 11, 2025 | Retrieval Text to Video Retrieval | Unverified 0
From Play to Replay: Composed Video Retrieval for Temporally Fine-Grained Videos | Jun 5, 2025 | Action Classification Composed Video Retrieval (CoVR) | Code Available 0
Leveraging Auxiliary Information in Text-to-Video Retrieval: A Review | May 29, 2025 | Retrieval Text to Video Retrieval | Unverified 0
Learning World Models for Interactive Video Generation | May 28, 2025 | In-Context Learning Retrieval | Unverified 0
A Challenge to Build Neuro-Symbolic Video Agents | May 20, 2025 | Scene Classification Video Retrieval | Code Available 0
LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts | May 20, 2025 | Caption Generation Retrieval | Code Available 1
Video-GPT via Next Clip Diffusion | May 18, 2025 | Denoising Image Animation | Code Available 1
Contrastive Alignment with Semantic Gap-Aware Corrections in Text-Video Retrieval | May 18, 2025 | Contrastive Learning Retrieval | Code Available 0
CMAWRNet: Multiple Adverse Weather Removal via a Unified Quaternion Neural Architecture | May 3, 2025 | Autonomous Driving Benchmarking | Unverified 0
Empowering Agentic Video Analytics Systems with Video Language Models | May 1, 2025 | Knowledge Graphs RAG | Unverified 0
ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams | Apr 21, 2025 | Informativeness Low-latency processing | Code Available 0
Prototypes are Balanced Units for Efficient and Effective Partially Relevant Video Retrieval | Apr 17, 2025 | Partially Relevant Video Retrieval Retrieval | Unverified 0
Towards Efficient Partially Relevant Video Retrieval with Active Moment Discovering | Apr 15, 2025 | Partially Relevant Video Retrieval Retrieval | Code Available 0
Towards Efficient and Robust Moment Retrieval System: A Unified Framework for Multi-Granularity Models and Temporal Reranking | Apr 11, 2025 | Moment Retrieval Question Answering | Unverified 0
TC-MGC: Text-Conditioned Multi-Grained Contrastive Learning for Text-Video Retrieval | Apr 7, 2025 | Contrastive Learning Retrieval | Code Available 0
Leveraging Modality Tags for Enhanced Cross-Modal Video Retrieval | Apr 2, 2025 | cross-modal alignment Retrieval | Unverified 0
Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval | Mar 24, 2025 | Retrieval Text to Video Retrieval | Unverified 0
Enhancing Subsequent Video Retrieval via Vision-Language Models (VLMs) | Mar 21, 2025 | Representation Learning Retrieval | Code Available 0
Long-VMNet: Accelerating Long-Form Video Understanding via Fixed Memory | Mar 17, 2025 | Form GPU | Unverified 0
StableFusion: Continual Video Retrieval via Frame Adaptation | Mar 13, 2025 | Continual Learning Mixture-of-Experts | Code Available 1
Quality Over Quantity? LLM-Based Curation for a Data-Efficient Audio-Video Foundation Model | Mar 12, 2025 | AudioCaps Contrastive Learning | Unverified 0
Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions | Mar 7, 2025 | Retrieval Video Retrieval | Unverified 0
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning | Mar 4, 2025 | Contrastive Learning Image-text Retrieval | Unverified 0
Composed Multi-modal Retrieval: A Survey of Approaches and Applications | Mar 3, 2025 | Cross-Modal Retrieval Data Augmentation | Code Available 2
Learning to Generate Long-term Future Narrations Describing Activities of Daily Living | Mar 3, 2025 | Action Anticipation Decision Making | Unverified 0
TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba | Feb 21, 2025 | image-classification Image Classification | Unverified 0
Generative Ghost: Investigating Ranking Bias Hidden in AI-Generated Videos | Feb 11, 2025 | Contrastive Learning Image Retrieval | Unverified 0
VideoRoPE: What Makes for Good Video Rotary Position Embedding? | Feb 7, 2025 | Hallucination Position | Code Available 3
HORUS: Multimodal Large Language Models Framework for Video Retrieval at VBS 2025 | Jan 1, 2025 | Image Retrieval Retrieval | Unverified 0
CaReBench: A Fine-Grained Benchmark for Video Captioning and Retrieval | Dec 31, 2024 | Retrieval Text Retrieval | Unverified 0
Hierarchical Banzhaf Interaction for General Video-Language Representation Learning | Dec 30, 2024 | Contrastive Learning Question Answering | Code Available 0
PolySmart @ TRECVid 2024 Medical Video Question Answering | Dec 20, 2024 | Question Answering Retrieval | Unverified 0
Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning | Dec 18, 2024 | Moment Retrieval Multi-Task Learning | Unverified 0
Gramian Multimodal Representation Learning and Alignment | Dec 16, 2024 | Contrastive Learning Representation Learning | Code Available 2
Generative Semantic Communication: Architectures, Technologies, and Applications | Dec 11, 2024 | Retrieval Semantic Communication | Unverified 0
Multimodal Contextualized Support for Enhancing Video Retrieval System | Dec 10, 2024 | object-detection Object Detection | Unverified 0
Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension | Nov 20, 2024 | GPU MME | Code Available 3
ContextIQ: A Multimodal Expert-Based Video Retrieval System for Contextual Advertising | Oct 29, 2024 | Retrieval Text to Video Retrieval | Code Available 0
Generating Signed Language Instructions in Large-Scale Dialogue Systems | Oct 17, 2024 | Retrieval Text Generation | Code Available 0
MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval | Oct 15, 2024 | Descriptive Retrieval | Unverified 0
Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval | Oct 9, 2024 | Retrieval Text Retrieval | Code Available 1
VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models | Oct 1, 2024 | Hallucination text similarity | Unverified 0
TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm | Sep 30, 2024 | Retrieval Video Retrieval | Code Available 0
Video DataFlywheel: Resolving the Impossible Data Trinity in Video-Language Understanding | Sep 29, 2024 | Diversity Question Answering | Unverified 0
Unfolding Videos Dynamics via Taylor Expansion | Sep 4, 2024 | Action Detection Action Recognition | Unverified 0
TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval | Sep 2, 2024 | GPU Retrieval | Code Available 1
Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets | Sep 2, 2024 | Video Alignment Video Editing | Unverified 0
T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval | Aug 21, 2024 | Retrieval Video Retrieval | Code Available 1
MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval | Aug 20, 2024 | Mamba Natural Language Queries | Code Available 1