Video Visual Relation Detection
Video Visual Relation Detection (VidVRD) aims to detect instances of visual relations of interest in a video, where each relation instance is represented by a (subject, predicate, object) relation triplet together with the trajectories of the subject and object. Compared to still images, videos provide a more natural set of features for detecting visual relations, including dynamic relations such as “A-follow-B” and “A-towards-B”, and temporally changing relations such as “A-chase-B” followed by “A-hold-B”. However, VidVRD is technically more challenging than image visual relation detection (ImgVRD) because accurate object tracking is difficult and relation appearances are more diverse in the video domain.
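As a rough illustration of the output described above, a relation instance could be modeled as a triplet plus two per-frame box trajectories. This is a minimal sketch, not the format of any particular dataset or paper: the class name `RelationInstance`, its field names, the half-open frame range, and the (x_min, y_min, x_max, y_max) box convention are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class RelationInstance:
    """Hypothetical container for one detected video relation."""

    subject_label: str
    predicate: str
    object_label: str
    start_frame: int  # inclusive (assumed convention)
    end_frame: int    # exclusive (assumed convention)
    subject_trajectory: List[Box]  # one box per frame in [start_frame, end_frame)
    object_trajectory: List[Box]   # one box per frame in [start_frame, end_frame)

    def __post_init__(self) -> None:
        length = self.end_frame - self.start_frame
        if length <= 0:
            raise ValueError("end_frame must be greater than start_frame")
        if (len(self.subject_trajectory) != length
                or len(self.object_trajectory) != length):
            raise ValueError("each trajectory needs one box per frame")

    @property
    def triplet(self) -> Tuple[str, str, str]:
        """The (subject, predicate, object) relation triplet."""
        return (self.subject_label, self.predicate, self.object_label)


# A temporally changing relation: "A-chase-B" followed by "A-hold-B".
chase = RelationInstance(
    "A", "chase", "B", 0, 2,
    subject_trajectory=[(0, 0, 10, 10), (2, 0, 12, 10)],
    object_trajectory=[(30, 0, 40, 10), (31, 0, 41, 10)],
)
hold = RelationInstance(
    "A", "hold", "B", 2, 3,
    subject_trajectory=[(28, 0, 38, 10)],
    object_trajectory=[(30, 0, 40, 10)],
)
```

Storing the frame range alongside the trajectories lets the same subject-object pair carry different predicates over consecutive segments, which is how the temporally changing case is represented here.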