Visual Reasoning
The ability to understand the actions and reasoning associated with visual images.
Datasets: Winoground, NLVR2 Dev, NLVR2 Test, CLEVRER, Bongard-OpenWorld, WinoGAViL, VSR, PHYRE-1B-Cross, PHYRE-1B-Within, VASR, IRFL: Image Recognition of Figurative Language, NLVR
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | BEiT-3 | Accuracy | 92.58 | — | Unverified |
| 2 | X2-VLM (large) | Accuracy | 89.4 | — | Unverified |
| 3 | XFM (base) | Accuracy | 88.4 | — | Unverified |
| 4 | CoCa | Accuracy | 87 | — | Unverified |
| 5 | X2-VLM (base) | Accuracy | 87 | — | Unverified |
| 6 | VLMo | Accuracy | 86.86 | — | Unverified |
| 7 | SimVLM | Accuracy | 85.15 | — | Unverified |
| 8 | X-VLM (base) | Accuracy | 84.76 | — | Unverified |
| 9 | BLIP-129M | Accuracy | 83.09 | — | Unverified |
| 10 | ALBEF (14M) | Accuracy | 82.55 | — | Unverified |