SOTAVerified

Visual Entailment

Visual Entailment (VE) is a task over image-sentence pairs in which the premise is an image rather than a natural language sentence, as it would be in traditional Textual Entailment tasks. The goal is to predict whether the image semantically entails the text.
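To make the input and output of the task concrete, here is a minimal sketch of how a VE example and an accuracy check might be represented. The three-way label set is an assumption taken from the common formulation of the task (the definition above only mentions entailment), and the names `VEExample`, `accuracy`, and the image paths are hypothetical, not part of any listed paper's code.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Assumed label set: the usual three-way formulation of the task.
LABELS = ("entailment", "neutral", "contradiction")


@dataclass(frozen=True)
class VEExample:
    image_path: str  # premise: an image (hypothetical file path)
    hypothesis: str  # hypothesis: a natural language sentence
    label: str       # gold label, one of LABELS

    def __post_init__(self) -> None:
        # Reject labels outside the assumed label set.
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def accuracy(
    predict: Callable[[str, str], str],
    examples: Sequence[VEExample],
) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    if not examples:
        raise ValueError("examples must be non-empty")
    correct = sum(
        predict(ex.image_path, ex.hypothesis) == ex.label for ex in examples
    )
    return correct / len(examples)
```

Any model can be plugged in as `predict`, as long as it maps an image path and a hypothesis sentence to one of the labels.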

Papers

Showing 51-56 of 56 papers

| Title | Status | Hype |
|---|---|---|
| Chunk-aware Alignment and Lexical Constraint for Visual Entailment with Natural Language Explanations | Code | 0 |
| Prompt Tuning for Generative Multimodal Pretrained Models | Code | 0 |
| Visual Entailment: A Novel Task for Fine-Grained Image Understanding | Code | 0 |
| VEglue: Testing Visual Entailment Systems via Object-Aligned Joint Erasing | Code | 0 |
| p-Laplacian Adaptation for Generative Pre-trained Vision-Language Models | Code | 0 |
| OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework | Code | 0 |

No leaderboard results yet.