SOTAVerified

Visual Entailment

Visual Entailment (VE) is a task over image-sentence pairs in which the premise is an image rather than a natural language sentence, as it would be in traditional Textual Entailment tasks. The goal is to predict whether the image semantically entails the text.
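As a rough illustration of the task format described above, a VE example can be sketched as an image premise paired with a text hypothesis and a label. This is a minimal sketch, not any dataset's actual schema: the names `VEExample`, `Label`, `image_path`, and `accuracy` are hypothetical, and the three-way label set (entailment, neutral, contradiction) is an assumption carried over from the usual textual entailment convention.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    # Assumed three-way label set, borrowed from textual entailment.
    ENTAILMENT = "entailment"
    NEUTRAL = "neutral"
    CONTRADICTION = "contradiction"


@dataclass(frozen=True)
class VEExample:
    image_path: str  # premise: an image (hypothetical field name)
    hypothesis: str  # hypothesis: a natural language sentence
    label: Label     # gold relation between image and sentence


def accuracy(predictions, examples):
    """Fraction of examples whose predicted label matches the gold label."""
    if len(predictions) != len(examples):
        raise ValueError("predictions and examples must have the same length")
    if not examples:
        return 0.0
    correct = sum(pred == ex.label for pred, ex in zip(predictions, examples))
    return correct / len(examples)
```

A model for this task would map each `(image_path, hypothesis)` pair to a `Label`; `accuracy` then scores those predictions against the gold labels.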

Papers

Showing 51-56 of 56 papers

Title | Status | Hype
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training | - | 0
Playing Lottery Tickets with Vision and Language | - | 0
UNITER: Learning UNiversal Image-TExt Representations | - | 0
Visual Entailment: A Novel Task for Fine-Grained Image Understanding | Code | 0
Visual Entailment Task for Visually-Grounded Language Learning | Code | 0
Segment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing | - | 0

No leaderboard results yet.