SOTAVerified

Visual Entailment

Visual Entailment (VE) is a task consisting of image-sentence pairs in which the premise is defined by an image, rather than a natural language sentence as in traditional Textual Entailment tasks. The goal is to predict whether the image semantically entails the text.
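The structure of a VE instance can be sketched as follows. This is a hypothetical illustration, not code from any listed paper: the image path and hypothesis sentence are invented, and the three-way label set (entailment / neutral / contradiction) follows the convention of common VE benchmarks such as SNLI-VE.

```python
from dataclasses import dataclass

# Three-way relation set used by common VE benchmarks (e.g. SNLI-VE).
LABELS = ("entailment", "neutral", "contradiction")

@dataclass
class VEExample:
    image_path: str   # the premise is an image, not a premise sentence
    hypothesis: str   # natural-language hypothesis to verify against the image
    label: str        # gold relation between image premise and hypothesis

    def __post_init__(self):
        # Reject labels outside the three-way scheme.
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label}")

# Hypothetical example: the image shows a dog on a beach, so the
# hypothesis "An animal is outdoors." is entailed by the image.
example = VEExample(
    image_path="photos/dog_on_beach.jpg",  # hypothetical path
    hypothesis="An animal is outdoors.",
    label="entailment",
)
print(example.label)
```

A model for this task takes `image_path` and `hypothesis` as input and predicts one of the three labels; only the gold label is stored here.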

Papers

Showing 41-50 of 56 papers

Playing Lottery Tickets with Vision and Language
Pre-training image-language transformers for open-vocabulary tasks
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training
Few-shot Multimodal Multitask Multilingual Learning
Segment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing
How Much Can CLIP Benefit Vision-and-Language Tasks?
Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning
Unified Multimodal Pre-training and Prompt-based Tuning for Vision-Language Understanding and Generation
"Let's not Quote out of Context": Unified Vision-Language Pretraining for Context Assisted Image Captioning

No leaderboard results yet.