Referring Expression Segmentation
The task is to label the pixels of an image or video that belong to the object instance referred to by a linguistic expression. The referring expression (RE) must unambiguously identify a single object (the referent) within the discourse or scene.
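The leaderboards below report two aggregations of mask IoU that are easy to confuse: Overall IoU pools intersections and unions across the whole dataset (so large objects dominate), while Mean IoU averages per-sample IoUs. A minimal NumPy sketch (function and variable names are illustrative, not from any benchmark toolkit):

```python
import numpy as np

def overall_and_mean_iou(preds, gts):
    """Compute Overall IoU (dataset-pooled) and Mean IoU (per-sample average).

    preds, gts: sequences of boolean mask arrays of matching shapes.
    """
    inter_total, union_total, per_sample = 0, 0, []
    for pred, gt in zip(preds, gts):
        inter = np.logical_and(pred, gt).sum()
        union = np.logical_or(pred, gt).sum()
        inter_total += inter
        union_total += union
        # Convention: empty prediction and empty ground truth count as a perfect match.
        per_sample.append(inter / union if union > 0 else 1.0)
    return inter_total / union_total, float(np.mean(per_sample))
```

Because Overall IoU pools pixels, a model that does well on large objects can rank higher under Overall IoU than under Mean IoU even with identical predictions.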
Papers
145 papers address this task.
Datasets: RefCOCO val, RefCOCO testA, Refer-YouTube-VOS (2021 public validation), RefCOCO+ testB, A2D Sentences, RefCOCOg-val, J-HMDB, DAVIS 2017 (val), RefCOCOg-test, RefCOCO testB, PhraseCut, RefCOCO
Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DeRIS-L | Overall IoU | 85.41 | — | Unverified |
| 2 | HyperSeg | Overall IoU | 84.8 | — | Unverified |
| 3 | PSALM | Overall IoU | 83.6 | — | Unverified |
| 4 | MLCD-Seg-7B | Overall IoU | 83.6 | — | Unverified |
| 5 | HIPIE | Overall IoU | 82.8 | — | Unverified |
| 6 | EVF-SAM | Overall IoU | 82.4 | — | Unverified |
| 7 | UNINEXT-H | Overall IoU | 82.19 | — | Unverified |
| 8 | UniLSeg-100 | Overall IoU | 81.74 | — | Unverified |
| 9 | DETRIS | Overall IoU | 81 | — | Unverified |
| 10 | C3VG | Overall IoU | 80.89 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DeRIS-L | Overall IoU | 86.49 | — | Unverified |
| 2 | HyperSeg | Overall IoU | 85.7 | — | Unverified |
| 3 | MLCD-Seg-7B | Overall IoU | 85.3 | — | Unverified |
| 4 | EVF-SAM | Overall IoU | 84.2 | — | Unverified |
| 5 | HyperSeg | Overall IoU | 83.5 | — | Unverified |
| 6 | C3VG | Overall IoU | 83.18 | — | Unverified |
| 7 | MLCD-Seg-7B | Overall IoU | 82.9 | — | Unverified |
| 8 | DeRIS-L | Overall IoU | 82.34 | — | Unverified |
| 9 | DETRIS | Overall IoU | 81.9 | — | Unverified |
| 10 | MaskRIS (Swin-B, combined DB) | Overall IoU | 80.64 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MPG-SAM 2 | J&F | 73.9 | — | Unverified |
| 2 | VRS-HQ (Chat-UniVi-13B) | J&F | 71 | — | Unverified |
| 3 | GLEE-Pro | J&F | 70.6 | — | Unverified |
| 4 | UNINEXT-H | J&F | 70.1 | — | Unverified |
| 5 | ReferDINO (Swin-B) | J&F | 69.3 | — | Unverified |
| 6 | MUTR | J&F | 68.4 | — | Unverified |
| 7 | VLP (VLMo-L) | J&F | 67.6 | — | Unverified |
| 8 | UniRef-L (Swin-L) | J&F | 67.4 | — | Unverified |
| 9 | HTR (Pre-training) | J&F | 67.1 | — | Unverified |
| 10 | DsHmp (Video-Swin-Base) | J&F | 67.1 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DeRIS-L | Mean IoU | 78.59 | — | Unverified |
| 2 | MLCD-Seg-7B | Overall IoU | 75.6 | — | Unverified |
| 3 | HyperSeg | Overall IoU | 75.2 | — | Unverified |
| 4 | EVF-SAM | Overall IoU | 71.9 | — | Unverified |
| 5 | DETRIS | Overall IoU | 70.2 | — | Unverified |
| 6 | C3VG | Overall IoU | 68.95 | — | Unverified |
| 7 | UniLSeg-100 | Overall IoU | 68.15 | — | Unverified |
| 8 | UniLSeg-20 | Overall IoU | 66.99 | — | Unverified |
| 9 | UNINEXT-H | Overall IoU | 66.22 | — | Unverified |
| 10 | GROUNDHOG | Overall IoU | 64.9 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HINet | Overall IoU | 0.68 | — | Unverified |
| 2 | RefVOS | Overall IoU | 0.67 | — | Unverified |
| 3 | ClawCraneNet | Overall IoU | 0.64 | — | Unverified |
| 4 | CMSA+CFSA | Overall IoU | 0.62 | — | Unverified |
| 5 | RefVOS | Overall IoU | 0.6 | — | Unverified |
| 6 | SgMg (Video-Swin-B) | AP | 0.59 | — | Unverified |
| 7 | SOC (Video-Swin-B) | AP | 0.57 | — | Unverified |
| 8 | ReferFormer (Video-Swin-B) | AP | 0.55 | — | Unverified |
| 9 | SOC (Video-Swin-T) | AP | 0.5 | — | Unverified |
| 10 | MANET | AP | 0.47 | — | Unverified |
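One of the leaderboards above reports J&F, the standard video-segmentation score: the mean of region similarity J (mask IoU) and contour accuracy F (a boundary F-measure). A simplified sketch of both components — note that real evaluations such as the DAVIS toolkit match boundaries with a pixel tolerance, which this illustration omits:

```python
import numpy as np

def boundary(mask):
    # Boundary pixels: foreground pixels with at least one background 4-neighbor.
    m = mask.astype(bool)
    padded = np.pad(m, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    return m & ~interior

def j_and_f(pred, gt):
    """J = region similarity (mask IoU); F = boundary F-measure.

    Simplified: boundaries must match exactly (no pixel tolerance).
    Returns (J, F, J&F) where J&F is the mean of J and F.
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = (pred | gt).sum()
    j = (pred & gt).sum() / union if union else 1.0
    pb, gb = boundary(pred), boundary(gt)
    hits = (pb & gb).sum()
    precision = hits / pb.sum() if pb.sum() else 1.0
    recall = hits / gb.sum() if gb.sum() else 1.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return j, f, (j + f) / 2
```

For video benchmarks, J and F are computed per frame per object and then averaged over the sequence before being averaged together into the single J&F number reported above.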