Phrase Grounding
Given an image and a corresponding caption, the Phrase Grounding task aims to localize each entity mentioned by a noun phrase in the caption to a region of the image.
Source: Phrase Grounding by Soft-Label Chain Conditional Random Field
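Most entries below report Recall@1 (R@1): the top-ranked predicted box for a phrase counts as correct when its intersection-over-union (IoU) with the ground-truth box is at least 0.5. A minimal sketch of that evaluation, with illustrative function names and toy boxes (not taken from any of the listed papers):

```python
# Sketch of R@1 for phrase grounding: a phrase is grounded correctly when
# the top-1 predicted box overlaps the ground-truth box with IoU >= 0.5.

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def recall_at_1(predictions, ground_truths, threshold=0.5):
    """Fraction of phrases whose top-1 predicted box reaches the IoU threshold."""
    hits = sum(iou(p, g) >= threshold for p, g in zip(predictions, ground_truths))
    return hits / len(ground_truths)

# Toy example: two phrases, one grounded correctly.
preds = [(10, 10, 50, 50), (0, 0, 20, 20)]
gts   = [(12, 12, 48, 52), (60, 60, 90, 90)]
print(recall_at_1(preds, gts))  # 0.5
```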
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GLIPv2 | R@1 | 87.7 | — | Unverified |
| 2 | FIBER-B | R@1 | 87.4 | — | Unverified |
| 3 | GLIP | R@1 | 87.1 | — | Unverified |
| 4 | PEVL | R@1 | 84.4 | — | Unverified |
| 5 | MDETR-ENB5 | R@1 | 84.3 | — | Unverified |
| 6 | DIGN | R@1 | 78.73 | — | Unverified |
| 7 | LCMCG | R@1 | 76.74 | — | Unverified |
| 8 | Soft-Label Chain CRF (SL-CCRF) | R@1 | 74.69 | — | Unverified |
| 9 | DDPN (ResNet-101) | R@1 | 73.3 | — | Unverified |
| 10 | VisualBERT | R@1 | 71.33 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GbS Ensemble + 12-in-1 | Pointing Game Accuracy | 85.9 | — | Unverified |
| 2 | GbS Ensemble MS-COCO | Pointing Game Accuracy | 75.6 | — | Unverified |
| 3 | COCO_ELMo_PNASNet | Pointing Game Accuracy | 69.19 | — | Unverified |
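Several tables use Pointing Game Accuracy instead of R@1: the model predicts a single point (e.g. the argmax of an attention or heat map), and the prediction is a hit when that point falls inside the ground-truth box. A minimal sketch under that assumption, with illustrative names and toy data:

```python
# Sketch of Pointing Game Accuracy: a prediction is a single point, scored
# as a hit when it lies inside the ground-truth box for the phrase.

def pointing_hit(point, box):
    """True if point (x, y) lies inside box (x1, y1, x2, y2)."""
    x, y = point
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2

def pointing_game_accuracy(points, boxes):
    """Fraction of phrases whose predicted point hits the ground-truth box."""
    hits = sum(pointing_hit(p, b) for p, b in zip(points, boxes))
    return hits / len(boxes)

# Toy example: three phrases, two hits.
points = [(30, 30), (5, 5), (70, 70)]
boxes  = [(10, 10, 50, 50), (40, 40, 60, 60), (60, 60, 90, 90)]
print(pointing_game_accuracy(points, boxes))  # 2/3
```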

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | FIBER-B | R@1 | 87.1 | — | Unverified |
| 2 | PEVL | R@1 | 84.1 | — | Unverified |
| 3 | VisualBERT | R@1 | 70.4 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VG_BiLSTM_VGG | Pointing Game Accuracy | 62.76 | — | Unverified |
| 2 | GbS Ensemble MS-COCO | Pointing Game Accuracy | 58.21 | — | Unverified |
| 3 | MCB | Accuracy | 28.91 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GbS VG | Pointing Game Accuracy | 55.91 | — | Unverified |
| 2 | VG_ELMo_PNASNet | Pointing Game Accuracy | 55.16 | — | Unverified |
| 3 | GbS Ensemble MS-COCO | Pointing Game Accuracy | 54.55 | — | Unverified |