SOTAVerified

Referring Expression

The referring expression task places a bounding box around the image instance that corresponds to the provided description.

Papers

Showing 221–230 of 364 papers

Title | Status | Hype
Scene-Intuitive Agent for Remote Embodied Visual Grounding | | 0
Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos | | 0
OCID-Ref: A 3D Robotic Dataset with Embodied Language for Clutter Scene Grounding | Code | 1
Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning | Code | 1
Referring Segmentation in Images and Videos with Cross-Modal Self-Attention Network | | 0
Unifying Vision-and-Language Tasks via Text Generation | Code | 1
Visual Question Answering based on Local-Scene-Aware Referring Expression Generation | | 0
TRAR: Routing the Attention Spans in Transformer for Visual Question Answering | Code | 1
MDETR - Modulated Detection for End-to-End Multi-Modal Understanding | Code | 2
Language Controls More Than Top-Down Attention: Modulating Bottom-Up Visual Processing with Referring Expressions | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Random | Acc@0.5m | 14.6 | | Unverified