Natural Language Visual Grounding
Papers
| Title | Status | Hype |
|---|---|---|
| Composing Pick-and-Place Tasks By Grounding Language | Code | 0 |
| Robust Change Captioning | Code | 0 |
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UGround-V1-7B | Accuracy (%) | 86.34 | — | Unverified |
| 2 | Aguvis-7B | Accuracy (%) | 83 | — | Unverified |
| 3 | OS-Atlas-Base-7B | Accuracy (%) | 82.47 | — | Unverified |
| 4 | Aria-UI | Accuracy (%) | 81.1 | — | Unverified |
| 5 | Aguvis-G-7B | Accuracy (%) | 81 | — | Unverified |
| 6 | UGround-V1-2B | Accuracy (%) | 77.67 | — | Unverified |
| 7 | ShowUI | Accuracy (%) | 75.1 | — | Unverified |
| 8 | ShowUI-G | Accuracy (%) | 75 | — | Unverified |
| 9 | UGround | Accuracy (%) | 73.3 | — | Unverified |
| 10 | OmniParser | Accuracy (%) | 73 | — | Unverified |