Video Narrative Grounding

Video Narrative Grounding is the task of linking video narratives to specific video segments. The input is a video together with a text description (the narrative) in which the positions of certain nouns are marked. For each marked noun, the method must output, in every video frame, a segmentation mask for the object the noun refers to.

Source: Connecting Vision and Language with Video Localized Narratives
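
The input and output described above can be pictured as a small data structure. The following is only an illustrative sketch, not the dataset's actual format: the names `MarkedNoun`, `NarrativeSample`, `mark_noun`, `blank_prediction`, and `has_expected_shape` are hypothetical, and marking nouns by character offsets is an assumption made here for simplicity.

```python
from dataclasses import dataclass
from typing import Dict, List

# One binary mask per frame: height x width grid of booleans (assumed layout).
Mask = List[List[bool]]


@dataclass(frozen=True)
class MarkedNoun:
    """Position of a marked noun in the narrative (assumed: character offsets)."""
    start: int  # inclusive
    end: int    # exclusive


@dataclass
class NarrativeSample:
    """Hypothetical container for one task input."""
    num_frames: int
    height: int
    width: int
    narrative: str
    nouns: List[MarkedNoun]


def mark_noun(narrative: str, word: str) -> MarkedNoun:
    """Mark the first occurrence of `word` in the narrative."""
    start = narrative.index(word)
    return MarkedNoun(start, start + len(word))


def blank_prediction(sample: NarrativeSample) -> Dict[MarkedNoun, List[Mask]]:
    """Placeholder 'method': an all-False mask per frame for every marked noun."""
    return {
        noun: [
            [[False] * sample.width for _ in range(sample.height)]
            for _ in range(sample.num_frames)
        ]
        for noun in sample.nouns
    }


def has_expected_shape(
    sample: NarrativeSample, prediction: Dict[MarkedNoun, List[Mask]]
) -> bool:
    """Check that there is one height x width mask per frame for each marked noun."""
    if set(prediction) != set(sample.nouns):
        return False
    for masks in prediction.values():
        if len(masks) != sample.num_frames:
            return False
        for mask in masks:
            if len(mask) != sample.height:
                return False
            if any(len(row) != sample.width for row in mask):
                return False
    return True
```

A real method would replace `blank_prediction` with a model that fills in the masks; the shape check only expresses the output contract stated in the task definition.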

Papers

Showing 1–3 of 3 papers

Title	Date	Tasks	Status	Hype
Connecting Vision and Language with Video Localized Narratives	Feb 22, 2023	Question Answering, Video Narrative Grounding	Code Available	1
Point-VOS: Pointing Up Video Object Segmentation	Feb 8, 2024	Object, Semantic Segmentation	Unverified	0
Transformer with Controlled Attention for Synchronous Motion Captioning	Sep 13, 2024	Action Localization, Action Segmentation	Code Available	0

No leaderboard results yet.