SOTAVerified

Visual Commonsense Reasoning

Papers

Showing 21-30 of 65 papers

Title | Status | Hype
Multi-modal Large Language Model Enhanced Pseudo 3D Perception Framework for Visual Commonsense Reasoning | - | 0
Fusing Pre-Trained Language Models With Multimodal Prompts Through Reinforcement Learning | Code | 1
VASR: Visual Analogies of Situation Recognition | Code | 0
A survey on knowledge-enhanced multimodal learning | - | 0
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | Code | 2
ILLUME: Rationalizing Vision-Language Models through Human Interactions | Code | 0
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization | - | 0
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models | Code | 1
Super-Prompting: Utilizing Model-Independent Contextual Data to Reduce Data Annotation Required in Visual Commonsense Tasks | - | 0
Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks | - | 0

No leaderboard results yet.