Video Grounding

Video grounding is the task of linking natural language descriptions to specific video segments. Given a video and a natural language description, such as a sentence or a caption, the model must identify the segment of the video that corresponds to the description. This can mean localizing the objects or actions mentioned in the description within the video, or associating the description with a specific time interval.
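When the answer is a time interval, a prediction is commonly compared with the annotated interval by temporal intersection over union (IoU). The following is a minimal sketch, assuming intervals are given as (start, end) pairs in seconds; the function name `temporal_iou` and the example values are illustrative and not taken from any particular library or dataset.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) intervals, in seconds."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # The overlap is zero when the two intervals do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical example: the description is annotated at 12-20 s
# and a model predicts 10-18 s.
iou = temporal_iou((10.0, 18.0), (12.0, 20.0))
print(round(iou, 3))  # prints 0.6 (6 s overlap over a 10 s union)
```

An IoU of 1.0 means the predicted and annotated intervals coincide, and 0.0 means they do not overlap at all.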

Papers


Showing 101–110 of 114 papers

Title	Date	Tasks	Status
End-to-End Dense Video Grounding via Parallel Regression	Sep 23, 2021	regression, Sentence	Unverified
On Pursuit of Designing Multi-modal Transformer for Video Grounding	Sep 13, 2021	All, Decoder	Unverified
EVOQUER: Enhancing Temporal Grounding with Video-Pivoted BackQuery Generation	Sep 10, 2021	Translation, Video Grounding	Unverified
Support-Set Based Cross-Supervision for Video Grounding	Aug 24, 2021	Contrastive Learning, Video Grounding	Unverified
Interventional Video Grounding with Dual Contrastive Learning	Jun 21, 2021	Causal Inference, Contrastive Learning	Code Available
Augmented 2D-TAN: A Two-stage Approach for Human-centric Spatio-Temporal Video Grounding	Jun 20, 2021	Spatio-Temporal Video Grounding, Video Grounding	Unverified
Cascaded Prediction Network via Segment Tree for Temporal Video Grounding	Jun 19, 2021	Sentence, Video Grounding	Unverified
Parallel Attention Network with Sequence Matching for Video Grounding	May 18, 2021	Representation Learning, Video Grounding	Unverified
Cross-Modal learning for Audio-Visual Video Parsing	Apr 3, 2021	Event Detection, Multiple Instance Learning	Code Available
Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos	Mar 23, 2021	Referring Expression, Referring Expression Comprehension	Unverified



Datasets: QVHighlights, MAD

Benchmark Results

QVHighlights

#	Model	Metric	Claimed	Verified	Status
1	InternVideo2-6B	R@1,IoU=0.7	56.45	—	Unverified
2	InternVideo2-1B	R@1,IoU=0.7	54.45	—	Unverified
3	LLMEPET	R@1,IoU=0.7	49.94	—	Unverified
4	QD-DETR	R@1,IoU=0.7	44.98	—	Unverified
5	DiffusionVMR	R@1,IoU=0.7	44.49	—	Unverified
6	UMT	R@1,IoU=0.7	41.18	—	Unverified
7	Moment-DETR	R@1,IoU=0.7	33.02	—	Unverified

MAD

#	Model	Metric	Claimed	Verified	Status
1	DeCafNet	R@1,IoU=0.1	13.25	—	Unverified
2	DenoiseLoc	R@1,IoU=0.1	11.59	—	Unverified
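
The Metric column reports R@1 at a temporal IoU threshold: the percentage of queries whose top-ranked predicted interval overlaps the annotated interval with an IoU at or above that threshold. Below is a rough sketch of the computation, assuming one top-1 prediction and one annotated interval per query, both as (start, end) pairs in seconds. The function name `recall_at_1` and the sample data are hypothetical; official evaluation scripts for a given benchmark may differ in details such as how the boundary case is handled.

```python
def recall_at_1(top1_preds, ground_truths, iou_threshold):
    """Percentage of queries whose top-1 interval reaches the IoU threshold."""
    hits = 0
    for (ps, pe), (gs, ge) in zip(top1_preds, ground_truths):
        # Temporal IoU of the predicted and annotated intervals.
        inter = max(0.0, min(pe, ge) - max(ps, gs))
        union = (pe - ps) + (ge - gs) - inter
        iou = inter / union if union > 0 else 0.0
        if iou >= iou_threshold:
            hits += 1
    return 100.0 * hits / len(ground_truths)


# Hypothetical top-1 predictions and annotations for four queries.
preds = [(10.0, 18.0), (0.0, 5.0), (30.0, 40.0), (2.0, 4.0)]
gts = [(12.0, 20.0), (0.0, 5.0), (50.0, 60.0), (2.0, 5.0)]

print(recall_at_1(preds, gts, 0.7))  # prints 25.0
print(recall_at_1(preds, gts, 0.5))  # prints 75.0
```

Raising the threshold makes the metric stricter: in this toy data, three of the four predictions pass at IoU=0.5 but only the exact match passes at IoU=0.7.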