SOTAVerified

Temporal Sentence Grounding

Temporal sentence grounding (TSG) aims to locate a specific moment in an untrimmed video given a natural language query. Different levels of supervision are used for this task: 1) Weak supervision: a video-level action category set; 2) Semi-weak supervision: a video-level action category set plus action annotations at several timestamps; 3) Full supervision: action category and temporal interval annotations for all actions in the untrimmed videos.
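Fully supervised TSG is typically evaluated with Recall@1 at a temporal IoU threshold (e.g. the R1@0.7 metric used in the benchmark tables below). As a minimal sketch (the function names here are illustrative, not from any particular codebase), the metric can be computed like this:

```python
def temporal_iou(pred, gt):
    """IoU between two [start, end] temporal intervals (in seconds)."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

def recall_at_1(predictions, ground_truths, threshold=0.7):
    """Fraction of queries whose top-1 predicted moment reaches
    IoU >= threshold with the ground-truth moment (R1@threshold)."""
    hits = sum(temporal_iou(p, g) >= threshold
               for p, g in zip(predictions, ground_truths))
    return hits / len(ground_truths)
```

For example, a prediction of [0, 10] against a ground truth of [5, 15] yields IoU = 5/15 ≈ 0.33, which would not count as a hit at the 0.7 threshold.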

Papers

Showing 1–25 of 43 papers

| Title | Status | Hype |
|---|---|---|
| Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models | Code | 2 |
| Uncovering Hidden Challenges in Query-Based Video Moment Retrieval | Code | 1 |
| D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation | Code | 1 |
| Weakly Supervised Temporal Sentence Grounding With Gaussian-Based Contrastive Proposal Learning | Code | 1 |
| DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos | Code | 1 |
| Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding | Code | 1 |
| BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos | Code | 1 |
| Gaussian Mixture Proposals with Pull-Push Learning Scheme to Capture Diverse Events for Weakly Supervised Temporal Video Grounding | Code | 1 |
| Span-based Localizing Network for Natural Language Video Localization | Code | 1 |
| Learning Temporal Sentence Grounding From Narrated EgoVideos | Code | 0 |
| Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos | Code | 0 |
| Temporal Sentence Grounding in Streaming Videos | Code | 0 |
| Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video | Code | 0 |
| A Closer Look at Temporal Sentence Grounding in Videos: Dataset and Metric | Code | 0 |
| Transformer with Controlled Attention for Synchronous Motion Captioning | Code | 0 |
| Diversifying Query: Region-Guided Transformer for Temporal Sentence Grounding | Code | 0 |
| Efficient Temporal Sentence Grounding in Videos with Multi-Teacher Knowledge Distillation | Code | 0 |
| Temporal Sentence Grounding in Videos: A Survey and Future Directions | — | 0 |
| Towards Debiasing Temporal Sentence Grounding in Video | — | 0 |
| Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video | — | 0 |
| Tracking Objects and Activities with Attention for Temporal Sentence Grounding | — | 0 |
| Transform-Equivariant Consistency Learning for Temporal Sentence Grounding | — | 0 |
| Video sentence grounding with temporally global textual knowledge | — | 0 |
| Weakly Supervised Temporal Sentence Grounding With Uncertainty-Guided Self-Training | — | 0 |
| Weakly Supervised Temporal Sentence Grounding via Positive Sample Mining | — | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DeCafNet | R1@0.7 | 47.55 | — | Unverified |
| 2 | AdaFocus (Full, MViT-Charades-Pretrain-feature, MMN model) | R1@0.7 | 38.6 | — | Unverified |
| 3 | AdaFocus (Full, I3D-Charades-Pretrain-feature, MMN model) | R1@0.7 | 35.6 | — | Unverified |
| 4 | MMN (Full, MViT-K400-Pretrain-feature, evaluated by AdaFocus) | R1@0.7 | 32.2 | — | Unverified |
| 5 | MMN (Full, I3D-K400-Pretrain-feature, evaluated by AdaFocus) | R1@0.7 | 29.8 | — | Unverified |
| 6 | AdaFocus (Weak, MViT-Charades-Pretrain-feature, CPL model) | R1@0.7 | 23.2 | — | Unverified |
| 7 | AdaFocus (Weak, I3D-Charades-Pretrain-feature, CPL model) | R1@0.7 | 22.4 | — | Unverified |
| 8 | CPL (Weak, MViT-K400-Pretrain-feature, evaluated by AdaFocus) | R1@0.7 | 21.8 | — | Unverified |
| 9 | AdaFocus (Semi-weak, MViT-Charades-Pretrain-feature, D3G model) | R1@0.7 | 21.8 | — | Unverified |
| 10 | AdaFocus (Semi-weak, I3D-Charades-Pretrain-feature, D3G model) | R1@0.7 | 21.1 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DeCafNet-100% | R@1, IoU=0.3 | 23.2 | — | Unverified |
| 2 | DeCafNet-50% | R@1, IoU=0.3 | 21.29 | — | Unverified |
| 3 | VSLNet | R@1, IoU=0.3 | 11.7 | — | Unverified |