SOTAVerified

Temporal Sentence Grounding

Temporal sentence grounding (TSG) aims to locate the moment in an untrimmed video that matches a given natural language query. The task is studied under three levels of supervision: 1) Weak supervision: a video-level action category set; 2) Semi-weak supervision: a video-level action category set plus action annotations at several timestamps; 3) Full supervision: action category and action interval annotations for all actions in the untrimmed videos.

Papers

Showing 26-43 of 43 papers

| Title | Status | Hype |
| --- | --- | --- |
| Weakly Supervised Temporal Sentence Grounding With Uncertainty-Guided Self-Training | | 0 |
| Weakly Supervised Temporal Sentence Grounding via Positive Sample Mining | | 0 |
| A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach | | 0 |
| Learning to Focus on the Foreground for Temporal Sentence Grounding | | 0 |
| Memory-Guided Semantic Learning Network for Temporal Sentence Grounding | | 0 |
| Multi-Pair Temporal Sentence Grounding via Multi-Thread Knowledge Transfer Network | | 0 |
| Progressively Guide to Attend: An Iterative Alignment Framework for Temporal Sentence Grounding | | 0 |
| Reducing the Vision and Language Bias for Temporal Sentence Grounding | | 0 |
| Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding | | 0 |
| Temporal Sentence Grounding in Videos: A Survey and Future Directions | | 0 |
| Learning Temporal Sentence Grounding From Narrated EgoVideos | Code | 0 |
| Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos | Code | 0 |
| Diversifying Query: Region-Guided Transformer for Temporal Sentence Grounding | Code | 0 |
| Temporal Sentence Grounding in Streaming Videos | Code | 0 |
| Efficient Temporal Sentence Grounding in Videos with Multi-Teacher Knowledge Distillation | Code | 0 |
| Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video | Code | 0 |
| A Closer Look at Temporal Sentence Grounding in Videos: Dataset and Metric | Code | 0 |
| Transformer with Controlled Attention for Synchronous Motion Captioning | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DeCafNet | R1@0.7 | 47.55 | | Unverified |
| 2 | AdaFocus (Full, MViT-Charades-Pretrain-feature, MMN model) | R1@0.7 | 38.6 | | Unverified |
| 3 | AdaFocus (Full, I3D-Charades-Pretrain-feature, MMN model) | R1@0.7 | 35.6 | | Unverified |
| 4 | MMN (Full, MViT-K400-Pretrain-feature, evaluated by AdaFocus) | R1@0.7 | 32.2 | | Unverified |
| 5 | MMN (Full, I3D-K400-Pretrain-feature, evaluated by AdaFocus) | R1@0.7 | 29.8 | | Unverified |
| 6 | AdaFocus (Weak, MViT-Charades-Pretrain-feature, CPL model) | R1@0.7 | 23.2 | | Unverified |
| 7 | AdaFocus (Weak, I3D-Charades-Pretrain-feature, CPL model) | R1@0.7 | 22.4 | | Unverified |
| 8 | CPL (Weak, MViT-K400-Pretrain-feature, evaluated by AdaFocus) | R1@0.7 | 21.8 | | Unverified |
| 9 | AdaFocus (Semi-weak, MViT-Charades-Pretrain-feature, D3G model) | R1@0.7 | 21.8 | | Unverified |
| 10 | AdaFocus (Semi-weak, I3D-Charades-Pretrain-feature, D3G model) | R1@0.7 | 21.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DeCafNet-100% | R@1,IoU=0.3 | 23.2 | | Unverified |
| 2 | DeCafNet-50% | R@1,IoU=0.3 | 21.29 | | Unverified |
| 3 | VSLNet | R@1,IoU=0.3 | 11.7 | | Unverified |
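
The metrics reported above (R1@0.7 and R@1,IoU=0.3) are conventionally read as the percentage of queries whose top-ranked predicted segment overlaps the ground-truth segment with a temporal IoU at or above the stated threshold. The sketch below illustrates that reading only. The function names `temporal_iou` and `recall_at_1` and the `(start, end)` segment representation are assumptions made for illustration, and the exact evaluation protocol behind each listed result may differ.

```python
from typing import Sequence, Tuple

# Assumed representation: a segment is (start_seconds, end_seconds).
Segment = Tuple[float, float]


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection-over-union of two 1-D time intervals."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(
    top1_preds: Sequence[Segment],
    ground_truths: Sequence[Segment],
    iou_threshold: float,
) -> float:
    """Percentage of queries whose top-1 prediction reaches the IoU threshold."""
    if len(top1_preds) != len(ground_truths):
        raise ValueError("one top-1 prediction is required per ground-truth segment")
    if not top1_preds:
        return 0.0
    hits = sum(
        temporal_iou(p, g) >= iou_threshold
        for p, g in zip(top1_preds, ground_truths)
    )
    return 100.0 * hits / len(top1_preds)
```

Under this reading, a stricter threshold such as 0.7 counts only tightly localized predictions as hits, so scores at IoU=0.7 and IoU=0.3 are not directly comparable.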