Can I Trust Your Answer? Visually Grounded Video Question Answering | Sep 4, 2023 | Grounded Video Question Answering, Question Answering | Code Available | 1
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection | Aug 29, 2023 | Denoising, Highlight Detection | Unverified | 0
Knowing Where to Focus: Event-aware Transformer for Video Grounding | Aug 14, 2023 | Moment Queries, Sentence | Code Available | 1
ViGT: Proposal-free Video Grounding with Learnable Token in Transformer | Aug 11, 2023 | Feature Correlation, regression | Unverified | 0
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory | Jul 26, 2023 | Contrastive Learning, Video Grounding | Unverified | 0
No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection | Jul 20, 2023 | Boundary Detection, Video Grounding | Unverified | 0
Dense Video Object Captioning from Disjoint Supervision | Jun 20, 2023 | Object, Sentence | Code Available | 0
Boundary-Denoising for Video Activity Localization | Apr 6, 2023 | Action Detection, Decoder | Code Available | 0
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | Mar 24, 2023 | Highlight Detection, Moment Retrieval | Code Available | 2
Generation-Guided Multi-Level Unified Network for Video Grounding | Mar 14, 2023 | Video Grounding | Unverified | 0
Text-Visual Prompting for Efficient 2D Temporal Video Grounding | Mar 9, 2023 | Sentence, Video Grounding | Code Available | 1
Localizing Moments in Long Video Via Multimodal Guidance | Feb 26, 2023 | Natural Language Moment Retrieval, Natural Language Visual Grounding | Code Available | 1
MINOTAUR: Multi-task Video Grounding From Multimodal Queries | Feb 16, 2023 | Action Detection, Sentence | Code Available | 0
Exploiting Auxiliary Caption for Video Grounding | Jan 15, 2023 | Contrastive Learning, Dense Video Captioning | Unverified | 0
WINNER: Weakly-Supervised hIerarchical decompositioN and aligNment for Spatio-tEmporal Video gRounding | Jan 1, 2023 | Contrastive Learning, Spatio-Temporal Video Grounding | Unverified | 0
Iterative Proposal Refinement for Weakly-Supervised Video Grounding | Jan 1, 2023 | Sentence, Video Grounding | Unverified | 0
Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding | Jan 1, 2023 | Object, Spatio-Temporal Video Grounding | Unverified | 0
Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language | Jan 1, 2023 | Question Answering, Self-Supervised Learning | Code Available | 0
Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding | Jan 1, 2023 | Decoder, Sentence | Unverified | 0
A Simple Transformer-Based Model for Ego4D Natural Language Queries Challenge | Nov 16, 2022 | Action Localization, Natural Language Queries | Code Available | 0
Language-free Training for Zero-shot Video Grounding | Oct 24, 2022 | Video Grounding | Unverified | 0
Weakly-Supervised Temporal Article Grounding | Oct 22, 2022 | All, Articles | Code Available | 1
Graph2Vid: Flow graph to Video Grounding for Weakly-supervised Multi-Step Localization | Oct 10, 2022 | Video Grounding | Unverified | 0
On the Effects of Video Grounding on Language Models | Oct 1, 2022 | Image Captioning, Question Answering | Unverified | 0
Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding | Sep 27, 2022 | Decoder, Spatio-Temporal Video Grounding | Code Available | 1
Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding | Sep 26, 2022 | Benchmarking, Natural Language Queries | Code Available | 0
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding | Sep 22, 2022 | Contrastive Learning, Video Grounding | Code Available | 1
Video-Guided Curriculum Learning for Spoken Video Grounding | Sep 1, 2022 | Video Grounding | Code Available | 0
Exploiting Feature Diversity for Make-up Temporal Video Grounding | Aug 12, 2022 | Diversity, Video Grounding | Unverified | 0
Team PKU-WICT-MIPL PIC Makeup Temporal Video Grounding Challenge 2022 Technical Report | Jul 6, 2022 | Sentence, Temporal Localization | Unverified | 0
STVGFormer: Spatio-Temporal Video Grounding with Static-Dynamic Cross-Modal Understanding | Jul 6, 2022 | Spatio-Temporal Video Grounding, Video Grounding | Unverified | 0
Gaussian Kernel-based Cross Modal Network for Spatio-Temporal Video Grounding | Jul 2, 2022 | Spatio-Temporal Video Grounding, Video Grounding | Unverified | 0
Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding | Apr 18, 2022 | Action Recognition, Animal Action Recognition | Code Available | 1
Position-aware Location Regression Network for Temporal Video Grounding | Apr 12, 2022 | Position, regression | Unverified | 0
TubeDETR: Spatio-Temporal Video Grounding with Transformers | Mar 30, 2022 | Decoder, Language-Based Temporal Localization | Code Available | 1
UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection | Mar 23, 2022 | Decoder, Highlight Detection | Code Available | 2
End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding | Mar 15, 2022 | Descriptive, Representation Learning | Unverified | 0
Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding | Mar 8, 2022 | Contrastive Learning, Sentence | Unverified | 0
Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos | Jan 25, 2022 | Natural Language Queries, Sentence | Code Available | 1
Unsupervised Temporal Video Grounding with Deep Semantic Clustering | Jan 14, 2022 | Clustering, Sentence | Unverified | 0
Semi-Supervised Video Paragraph Grounding With Contrastive Encoder | Jan 1, 2022 | Sentence, Video Grounding | Unverified | 0
Multi-Level Representation Learning With Semantic Alignment for Referring Video Object Segmentation | Jan 1, 2022 | Object, Referring Expression Segmentation | Unverified | 0
LocFormer: Enabling Transformers to Perform Temporal Moment Localization on Long Untrimmed Videos With a Feature Sampling Approach | Dec 19, 2021 | Inductive Bias, Video Grounding | Unverified | 0
Detecting Moments and Highlights in Videos via Natural Language Queries | Dec 1, 2021 | Decoder, Moment Retrieval | Code Available | 1
End-to-End Dense Video Grounding via Parallel Regression | Sep 23, 2021 | regression, Sentence | Unverified | 0
On Pursuit of Designing Multi-modal Transformer for Video Grounding | Sep 13, 2021 | All, Decoder | Unverified | 0
Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding | Sep 10, 2021 | Metric Learning, Representation Learning | Code Available | 1
EVOQUER: Enhancing Temporal Grounding with Video-Pivoted BackQuery Generation | Sep 10, 2021 | Translation, Video Grounding | Unverified | 0
Support-Set Based Cross-Supervision for Video Grounding | Aug 24, 2021 | Contrastive Learning, Video Grounding | Unverified | 0
VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer | Jul 6, 2021 | Image Retrieval, Knowledge Distillation | Code Available | 1