SOTAVerified

Video-to-image Affordance Grounding

Given a demonstration video V and a target image I, the goal of video-to-image affordance grounding is to predict an affordance heatmap over the target image, localizing the region that hands interact with in the video, together with the affordance action (e.g., press, turn).
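To make the task's input/output contract concrete, here is a minimal illustrative stub. The function name, tensor shapes, and action label are assumptions for illustration only, not the interface of any listed model; a real system replaces the body with a learned network.

```python
import numpy as np

def ground_affordance(video, image):
    """Illustrative stub for video-to-image affordance grounding.

    video: (T, H, W, 3) array of demonstration frames.
    image: (H, W, 3) target image.
    Returns (heatmap, action): a spatial heatmap over the target image
    that sums to 1, plus an affordance action label.
    """
    # Placeholder prediction: a uniform-random map normalized to a
    # probability distribution; a trained model would produce this.
    heat = np.random.rand(image.shape[0], image.shape[1])
    heat /= heat.sum()
    return heat, "press"  # "press" is a hypothetical example action

video = np.zeros((8, 64, 64, 3))   # 8-frame dummy demonstration
image = np.zeros((64, 64, 3))      # dummy target image
heat, action = ground_affordance(video, image)
```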

Papers

Showing 4 of 4 papers

| Title | Status | Hype |
| --- | --- | --- |
| Affordance Grounding from Demonstration Video to Target Image | Code | 1 |
| Learning Visual Affordance Grounding from Demonstration Videos | — | 0 |
| Grounded Human-Object Interaction Hotspots from Video | Code | 0 |
| Demo2Vec: Reasoning Object Affordances From Online Videos | — | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Hotspot | KLD | 1.26 | — | Unverified |
| 2 | HAG-Net (+Hand Box) | KLD | 1.21 | — | Unverified |
| 3 | Afformer | KLD | 0.97 | — | Unverified |
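The KLD metric above is Kullback-Leibler divergence between the predicted heatmap and the ground-truth heatmap (lower is better). A minimal sketch of how it is typically computed for heatmap evaluation follows; the epsilon value and the KL(gt || pred) convention are assumptions, as papers vary in these details.

```python
import numpy as np

def kld(pred, gt, eps=1e-12):
    """KL divergence KL(gt || pred) between two spatial heatmaps.

    Both maps are normalized to sum to 1 before comparison;
    eps guards against division by zero and log(0).
    Lower values mean the prediction matches the ground truth better.
    """
    pred = pred / (pred.sum() + eps)
    gt = gt / (gt.sum() + eps)
    return float(np.sum(gt * np.log(eps + gt / (pred + eps))))

h = np.random.rand(64, 64)
score = kld(h, h)  # identical maps give a divergence near 0
```

Note that KL divergence is asymmetric, so which map plays the role of the reference distribution matters when comparing numbers across papers.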