Visual Dialog
Visual Dialog requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a follow-up question about the image, the task is to answer the question.
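As a rough illustration of the task inputs described above, the sketch below models one example as an image reference, a dialog history, and a follow-up question. The class names, field names, and the `flatten_context` helper are hypothetical and chosen for illustration only; they do not come from any particular dataset release or library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DialogTurn:
    """One completed question/answer round (hypothetical structure)."""
    question: str
    answer: str


@dataclass
class VisualDialogExample:
    """Inputs for one prediction: image, dialog history, follow-up question."""
    image_path: str
    history: List[DialogTurn] = field(default_factory=list)
    question: str = ""


def flatten_context(example: VisualDialogExample) -> str:
    """Serialize the history and the follow-up question as plain text."""
    lines = []
    for turn in example.history:
        lines.append(f"Q: {turn.question}")
        lines.append(f"A: {turn.answer}")
    lines.append(f"Q: {example.question}")
    return "\n".join(lines)


# Illustrative data only, not taken from a real dataset.
example = VisualDialogExample(
    image_path="example.jpg",
    history=[DialogTurn("How many people are there?", "Two.")],
    question="Are they standing?",
)
```

A model would receive the image together with something like `flatten_context(example)` and be expected to produce an answer to the final question.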
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | 5xFGA + LS | NDCG | 64.04 | — | Unverified |
| 2 | 5xFGA + LS*+ | MRR | 0.71 | — | Unverified |
| 3 | Two-Step | MRR | 0.7 | — | Unverified |
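The table reports NDCG and MRR without saying how they are computed. The sketch below uses the textbook definitions of both ranking metrics, assuming a model scores a list of candidate answers per question. The function names and the cutoff parameter `k` are assumptions for illustration; the benchmark's official evaluation may differ in details such as the cutoff or the reporting scale (the NDCG entry above appears to be on a 0–100 scale, the MRR entries on a 0–1 scale).

```python
import math
from typing import Optional, Sequence


def mean_reciprocal_rank(gt_ranks: Sequence[int]) -> float:
    """Average of 1/rank, where each rank is the 1-based position
    of the ground-truth answer in the model's sorted candidate list."""
    return sum(1.0 / r for r in gt_ranks) / len(gt_ranks)


def ndcg(scores: Sequence[float],
         relevance: Sequence[float],
         k: Optional[int] = None) -> float:
    """Normalized discounted cumulative gain for one question.

    scores[i] is the model score for candidate i and relevance[i]
    is its annotated relevance. k is an assumed cutoff; by default
    all candidates are used.
    """
    if k is None:
        k = len(scores)
    # Candidate indices sorted by descending model score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Best achievable ordering of the relevance values.
    ideal = sorted(relevance, reverse=True)
    dcg = sum(relevance[i] / math.log2(pos + 2)
              for pos, i in enumerate(order[:k]))
    idcg = sum(rel / math.log2(pos + 2)
               for pos, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0
```

For example, `mean_reciprocal_rank([1, 2, 4])` averages 1, 1/2, and 1/4, and `ndcg` returns 1.0 whenever the model's ordering matches the ideal ordering of relevance values.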