SOTAVerified

Natural Language Visual Grounding

Papers

Showing 21-30 of 32 papers

Title | Status | Hype
Panoptic Narrative Grounding | Code | 1
A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions | Code | 1
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning | Code | 1
Self-Monitoring Navigation Agent via Auxiliary Progress Estimation | Code | 1
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks | Code | 1
Composing Pick-and-Place Tasks By Grounding Language | Code | 0
Modularized Textual Grounding for Counterfactual Resilience | Code | 0
Grounding of Textual Phrases in Images by Reconstruction | Code | 0
Robust Change Captioning | Code | 0
Searching for Ambiguous Objects in Videos using Relational Referring Expressions | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | UGround-V1-7B | Accuracy (%) | 86.34 | - | Unverified
2 | Aguvis-7B | Accuracy (%) | 83 | - | Unverified
3 | OS-Atlas-Base-7B | Accuracy (%) | 82.47 | - | Unverified
4 | Aria-UI | Accuracy (%) | 81.1 | - | Unverified
5 | Aguvis-G-7B | Accuracy (%) | 81 | - | Unverified
6 | UGround-V1-2B | Accuracy (%) | 77.67 | - | Unverified
7 | ShowUI | Accuracy (%) | 75.1 | - | Unverified
8 | ShowUI-G | Accuracy (%) | 75 | - | Unverified
9 | UGround | Accuracy (%) | 73.3 | - | Unverified
10 | OmniParser | Accuracy (%) | 73 | - | Unverified