Video Grounding

Video grounding is the task of linking natural language descriptions to specific video segments. The model is given a video and a natural language description, such as a sentence or a caption, and must identify the segment of the video that corresponds to the description. This can mean localizing the objects or actions mentioned in the description within the video, or associating a specific time interval with the description.
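For the time-interval case, a prediction is usually compared with the annotated interval by temporal intersection over union (IoU). The sketch below is a minimal illustration only: the `Segment` type and the `temporal_iou` function are hypothetical names introduced here, not part of any listed paper's code, and the intervals are assumed to be in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A time interval in a video, in seconds (hypothetical helper type)."""
    start: float
    end: float


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection over union of two time intervals; 0.0 if they do not overlap."""
    intersection = max(0.0, min(pred.end, gt.end) - max(pred.start, gt.start))
    union = (pred.end - pred.start) + (gt.end - gt.start) - intersection
    return intersection / union if union > 0 else 0.0


# A predicted segment that overlaps the annotated one by 5 of 15 total seconds.
predicted = Segment(start=10.0, end=20.0)
annotated = Segment(start=15.0, end=25.0)
print(round(temporal_iou(predicted, annotated), 3))  # 0.333
```

An IoU of 1.0 means the predicted and annotated intervals coincide exactly, and 0.0 means they are disjoint.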

Papers


Showing 71–80 of 114 papers

Title	Date	Tasks	Status
ViGT: Proposal-free Video Grounding with Learnable Token in Transformer	Aug 11, 2023	Feature Correlation, regression	Unverified
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory	Jul 26, 2023	Contrastive Learning, Video Grounding	Unverified
No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection	Jul 20, 2023	Boundary Detection, Video Grounding	Unverified
Dense Video Object Captioning from Disjoint Supervision	Jun 20, 2023	Object, Sentence	Code Available
Boundary-Denoising for Video Activity Localization	Apr 6, 2023	Action Detection, Decoder	Code Available
Generation-Guided Multi-Level Unified Network for Video Grounding	Mar 14, 2023	Video Grounding	Unverified
MINOTAUR: Multi-task Video Grounding From Multimodal Queries	Feb 16, 2023	Action Detection, Sentence	Code Available
Exploiting Auxiliary Caption for Video Grounding	Jan 15, 2023	Contrastive Learning, Dense Video Captioning	Unverified
WINNER: Weakly-Supervised hIerarchical decompositioN and aligNment for Spatio-tEmporal Video gRounding	Jan 1, 2023	Contrastive Learning, Spatio-Temporal Video Grounding	Unverified
Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding	Jan 1, 2023	Object, Spatio-Temporal Video Grounding	Unverified



Datasets: QVHighlights, MAD

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	InternVideo2-6B	R@1,IoU=0.7	56.45	—	Unverified
2	InternVideo2-1B	R@1,IoU=0.7	54.45	—	Unverified
3	LLMEPET	R@1,IoU=0.7	49.94	—	Unverified
4	QD-DETR	R@1,IoU=0.7	44.98	—	Unverified
5	DiffusionVMR	R@1,IoU=0.7	44.49	—	Unverified
6	UMT	R@1,IoU=0.7	41.18	—	Unverified
7	Moment-DETR	R@1,IoU=0.7	33.02	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DeCafNet	R@1,IoU=0.1	13.25	—	Unverified
2	DenoiseLoc	R@1,IoU=0.1	11.59	—	Unverified
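The metric in these tables, R@1,IoU=m, is commonly read as the percentage of queries whose top-ranked predicted segment overlaps the annotated segment with a temporal IoU of at least m. The following is a minimal sketch under that reading. The function name `recall_at_1`, the `(start, end)` tuple format, and the toy inputs are assumptions made for illustration; they are not taken from any of the listed models' evaluation code.

```python
from typing import Sequence, Tuple

Span = Tuple[float, float]  # (start, end) in seconds; assumed format


def recall_at_1(
    top1_predictions: Sequence[Span],
    ground_truths: Sequence[Span],
    iou_threshold: float,
) -> float:
    """Percentage of queries whose top-1 span reaches the IoU threshold."""
    if not top1_predictions or len(top1_predictions) != len(ground_truths):
        raise ValueError("need one top-1 prediction per ground-truth span")
    hits = 0
    for (p_start, p_end), (g_start, g_end) in zip(top1_predictions, ground_truths):
        intersection = max(0.0, min(p_end, g_end) - max(p_start, g_start))
        union = (p_end - p_start) + (g_end - g_start) - intersection
        iou = intersection / union if union > 0 else 0.0
        if iou >= iou_threshold:
            hits += 1
    return 100.0 * hits / len(top1_predictions)


# Toy example with three queries (made-up spans, not benchmark data).
predictions = [(0.0, 10.0), (5.0, 9.0), (30.0, 40.0)]
annotations = [(0.0, 10.0), (4.0, 10.0), (50.0, 60.0)]
print(round(recall_at_1(predictions, annotations, 0.7), 2))  # 33.33
print(round(recall_at_1(predictions, annotations, 0.5), 2))  # 66.67
```

A stricter threshold such as 0.7 rewards precise boundaries, while a loose one such as 0.1 only requires that the prediction touch the annotated moment.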