Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding | Nov 25, 2024 | Dense Video Captioning Transfer Learning | — | Unverified | 0
SimBase: A Simple Baseline for Temporal Video Grounding | Nov 12, 2024 | Video Grounding | — | Unverified | 0
SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses | Aug 3, 2024 | Natural Language Queries Video Grounding | — | Unverified | 0
Multi-sentence Video Grounding for Long Video Generation | Jul 18, 2024 | Moment Retrieval Retrieval | — | Unverified | 0
Described Spatial-Temporal Video Detection | Jul 8, 2024 | Multi-class Classification Temporal Localization | — | Unverified | 0
AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding | Jun 11, 2024 | regression Video Grounding | — | Unverified | 0
Simplify Implant Depth Prediction as Video Grounding: A Texture Perceive Implant Depth Prediction Network | Jun 7, 2024 | Depth Estimation Depth Prediction | — | Unverified | 0
Artemis: Towards Referential Understanding in Complex Videos | Jun 1, 2024 | Text Summarization Video Grounding | Code | Code Available | 0
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition | May 7, 2024 | Large Language Model Multimodal Large Language Model | — | Unverified | 0
SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding | Apr 1, 2024 | Mamba State Space Models | — | Unverified | 0
Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding | Mar 21, 2024 | Video Grounding | Code | Code Available | 0
VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding | Jan 1, 2024 | Spatio-Temporal Video Grounding Video Grounding | — | Unverified | 0
Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding | Dec 31, 2023 | Spatio-Temporal Video Grounding Video Grounding | — | Unverified | 0
Multi-Modal Domain Adaptation Across Video Scenes for Temporal Video Grounding | Dec 21, 2023 | Domain Adaptation Unsupervised Domain Adaptation | — | Unverified | 0
LLM4VG: Large Language Models Evaluation for Video Grounding | Dec 21, 2023 | Image Captioning Video Grounding | — | Unverified | 0
Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval | Dec 12, 2023 | Contrastive Learning Moment Retrieval | Code | Code Available | 0
EtC: Temporal Boundary Expand then Clarify for Weakly Supervised Video Grounding with Multimodal Large Language Model | Dec 5, 2023 | Boundary Detection Language Modeling | — | Unverified | 0
Exploring Iterative Refinement with Diffusion Models for Video Grounding | Oct 26, 2023 | Sentence Video Grounding | Code | Code Available | 0
Dual-Path Temporal Map Optimization for Make-up Temporal Video Grounding | Sep 12, 2023 | Sentence text similarity | Code | Code Available | 0
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection | Aug 29, 2023 | Denoising Highlight Detection | — | Unverified | 0
ViGT: Proposal-free Video Grounding with Learnable Token in Transformer | Aug 11, 2023 | Feature Correlation regression | — | Unverified | 0
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory | Jul 26, 2023 | Contrastive Learning Video Grounding | — | Unverified | 0
No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection | Jul 20, 2023 | Boundary Detection Video Grounding | — | Unverified | 0
Dense Video Object Captioning from Disjoint Supervision | Jun 20, 2023 | Object Sentence | Code | Code Available | 0
Boundary-Denoising for Video Activity Localization | Apr 6, 2023 | Action Detection Decoder | Code | Code Available | 0
Generation-Guided Multi-Level Unified Network for Video Grounding | Mar 14, 2023 | Video Grounding | — | Unverified | 0
MINOTAUR: Multi-task Video Grounding From Multimodal Queries | Feb 16, 2023 | Action Detection Sentence | Code | Code Available | 0
Exploiting Auxiliary Caption for Video Grounding | Jan 15, 2023 | Contrastive Learning Dense Video Captioning | — | Unverified | 0
WINNER: Weakly-Supervised hIerarchical decompositioN and aligNment for Spatio-tEmporal Video gRounding | Jan 1, 2023 | Contrastive Learning Spatio-Temporal Video Grounding | — | Unverified | 0
Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding | Jan 1, 2023 | Object Spatio-Temporal Video Grounding | — | Unverified | 0
Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language | Jan 1, 2023 | Question Answering Self-Supervised Learning | Code | Code Available | 0
Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding | Jan 1, 2023 | Decoder Sentence | — | Unverified | 0
Iterative Proposal Refinement for Weakly-Supervised Video Grounding | Jan 1, 2023 | Sentence Video Grounding | — | Unverified | 0
A Simple Transformer-Based Model for Ego4D Natural Language Queries Challenge | Nov 16, 2022 | Action Localization Natural Language Queries | Code | Code Available | 0
Language-free Training for Zero-shot Video Grounding | Oct 24, 2022 | Video Grounding | — | Unverified | 0
Graph2Vid: Flow graph to Video Grounding for Weakly-supervised Multi-Step Localization | Oct 10, 2022 | Video Grounding | — | Unverified | 0
On the Effects of Video Grounding on Language Models | Oct 1, 2022 | Image Captioning Question Answering | — | Unverified | 0
Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding | Sep 26, 2022 | Benchmarking Natural Language Queries | Code | Code Available | 0
Video-Guided Curriculum Learning for Spoken Video Grounding | Sep 1, 2022 | Video Grounding | Code | Code Available | 0
Exploiting Feature Diversity for Make-up Temporal Video Grounding | Aug 12, 2022 | Diversity Video Grounding | — | Unverified | 0
Team PKU-WICT-MIPL PIC Makeup Temporal Video Grounding Challenge 2022 Technical Report | Jul 6, 2022 | Sentence Temporal Localization | — | Unverified | 0
STVGFormer: Spatio-Temporal Video Grounding with Static-Dynamic Cross-Modal Understanding | Jul 6, 2022 | Spatio-Temporal Video Grounding Video Grounding | — | Unverified | 0
Gaussian Kernel-based Cross Modal Network for Spatio-Temporal Video Grounding | Jul 2, 2022 | Spatio-Temporal Video Grounding Video Grounding | — | Unverified | 0
Position-aware Location Regression Network for Temporal Video Grounding | Apr 12, 2022 | Position regression | — | Unverified | 0
End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding | Mar 15, 2022 | Descriptive Representation Learning | — | Unverified | 0
Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding | Mar 8, 2022 | Contrastive Learning Sentence | — | Unverified | 0
Unsupervised Temporal Video Grounding with Deep Semantic Clustering | Jan 14, 2022 | Clustering Sentence | — | Unverified | 0
Multi-Level Representation Learning With Semantic Alignment for Referring Video Object Segmentation | Jan 1, 2022 | Object Referring Expression Segmentation | — | Unverified | 0
Semi-Supervised Video Paragraph Grounding With Contrastive Encoder | Jan 1, 2022 | Sentence Video Grounding | — | Unverified | 0
LocFormer: Enabling Transformers to Perform Temporal Moment Localization on Long Untrimmed Videos With a Feature Sampling Approach | Dec 19, 2021 | Inductive Bias Video Grounding | — | Unverified | 0